OSKAR ANDERSON, 1887-1960


By Herman Wold


University Institute of Statistics, Uppsala 


Born 2 August 1887 in Minsk, Russia; deceased 12 February 1960 in Munich, 
Germany. These dates span a web of drama and colour both in personal life and 
scientific career. The course of outer events in Oskar Anderson’s life reflects the
turbulence and agonies of a Europe torn by wars and revolutions. His scientific
work, always marked by personal involvement, is of sufficient stature to be of
lasting interest, in part along with the epoch-making developments in statistics
during the first decades of this century, in part independently of these develop- 
ments. Some of Anderson’s endeavours were ahead of his time, along lines that 
have not yet received adequate attention. Thus his emphasis on causal analysis 
of nonexperimental data is a reminder that this important sector of applied 
statistics is far less developed than descriptive statistics and experimental analy- 
sis. In an appraisal of Anderson’s work, this aspect is highly significant. 

Anderson’s ethnic origin was Baltic-German. We follow him from his school 
years in Kazan, where his father was university professor of Finno-Ugric lan- 
guages. He graduated from secondary school in 1906 with a gold medal, studied 
mathematics for a year at Kazan university, and entered in 1907 the Economic 
Faculty of the renowned Polytechnic Institute of St. Petersburg (now Leningrad),
and studied economics for five years. His interests were in the broad
area that connects economics and statistics, and in these formative years he 
developed two main specialities: time series analysis and sampling surveys. As 
a pupil of A. A. Chuprov he submitted in 1911 a diploma thesis on correlation 
analysis of time series data. In the summer of 1915 he did field work as sampling 
surveyor, participating in a scientific expedition to Turkestan for an economic- 
technical study of the irrigation system of the Ferghana oasis. During the years 
1912-17 he was a teacher in a commercial secondary school in Petersburg. During
and after the Russian revolution he moved about, first inside Russia and then, 
leaving his country as a refugee, working as a teacher and scientific specialist. 
As statistician in a big cooperative center in the Ukraine he edited a number of 
monographs on the economic conditions in South Russia; in 1918 he qualified 
for the habilitation degree in mathematical-statistical methods at the Institute 
of Commerce at Kiev; at the same time he worked at the Demographic Institute 
of the Ukrainian Academy of Sciences; via Constantinople he came in 1921 to 
Budapest, where he founded and led a secondary school. From 1923 onwards he 
was a member of the Supreme Statistical Council in Bulgaria, the country 
where in 1924 he found stable ground under his feet. During the years 1924-34, 
at the Institute of Commerce at Varna, he taught statistics and several economic 
subjects, from 1929 as professor of economics and statistics. Then follows a period 
of intense activity. He goes deeply into econometric research, and in 1932 becomes
one of the founders of the Econometric Society. In 1933 he goes to Germany
and England on a Rockefeller stipend. In 1935 comes his statistical textbook
Introduction to Mathematical Statistics, published in German. From 1935
he is director of the Statistical Institute for Economic Research at the State 
University of Sofia. Under his directorship, the institute publishes some 50 
monographs and books on the economic conditions of Bulgaria; in several 
capacities—one being statistical expert to the League of Nations—he writes 
many articles and memoranda on statistical methods. In 1940 the Bulgarian 
government sent him to Germany to study the system of rationing. In 1942 the 
University of Kiel called upon him to become professor of statistics; moreover, 
he headed the department for Eastern Studies at the Kiel Institute of World 
Economy. From 1947 he was professor of statistics at the University of Munich.¹

Dangers and hardships were Anderson’s lot in World Wars I and II. When 
leaving Russia he lost a daughter, and a son died not long afterwards. A second 
son died in World War II as a paratrooper. Anderson was shattered but not 
crushed by the hard blows of fate. It is characteristic of his moral integrity that 
he did not allow politics to interfere with his scientific work, and his loyalty in 
personal contacts was beyond praise. Typical instances are on record, from the 
refugee years around 1920 as well as from the Nazi period in Germany. 

Dominant features in Anderson’s scientific profile are his intense engagement 
in his work, and his strong belief in the mission of statistical method in the socio- 
economic area. In particular, there is first the large volume of Anderson’s pub- 
lished work: in all some 150 items if minor articles and book reviews are in- 
cluded. The appended bibliography is a selection, in the main compiled from 
lists edited by Anderson himself.² There is further the high level of aspiration:
in theoretical research he made significant contributions towards developing 
new approaches, and his applied work is marked by a keen desire to make full 
use of the best possible techniques. Typical in this respect is his systematic use 
of random sampling in the surveys in Turkestan in 1915 and later in Bulgaria 
(1929d). Best known among his theoretical contributions is the variate difference 
method, which was introduced independently by Anderson and “Student”-Gosset
in 1914.³ Briefly stated, when studying the intercorrelations, interregressions
etc. of a set of time series the device is to analyse not the series themselves

¹ The present account of Anderson’s life borrows material from his pupils, to whose
obituary articles [1]-[4] reference is made for documentation and further details. For reading
my article in manuscript and for the ensuing helpful comments, especially towards an
appraisal of Anderson’s work, where my views are more independent, I am indebted to
Professors O. Anderson, Jr., Mannheim; E. Fels, Pittsburgh; R. Gunzert, Frankfurt a. M.;
H. Kellerer, Munich; S. Sagoroff, Vienna; and H. Strecker, Tübingen.

² See [5] and the 3rd edition of his second text-book (1954a).

³ “Student” [6] was first to present and apply the device, while Anderson (1914) has the
priority in making use of mathematical expectations to establish its rationale; see also 
(1929c), p. 58. The new point was the use of successive differences; first differences had 
been used earlier in regression analysis. For later developments, see [7]. 
but their consecutive differences with regard to the time variable; typical assumptions
are that a given time series $x_t$ may be written


(1) $x_t = P(t) + e_t, \qquad t = 0, \pm 1, \pm 2, \ldots,$

where P is a polynomial in t of finite order, and the residual component $e_t$ is a
sequence of random variables that are independent and all have the same distribution.
Third, there is the polemical pitch in many of his articles. The use
and abuse of index numbers is a favourite topic (1937), (1950c), (1952). A 
consequential contribution of the 1920’s is his criticism of the Harvard business 
barometer (1929b), his main argument being that the underlying time series 
decomposition was a shallow and too mechanical approach. Fourth, and finally, 
I refer to Anderson’s educational work. His statistical credo is voiced in his two 
textbooks (1935), (1954): the great responsibility of the statistician is to obtain 
accurate data, and to use sound methods to analyse the data. At Munich, in the 
last period of his life, educational problems were in the center of his interest 
(1949d), (1956a). It is largely thanks to Anderson’s initiative and efforts that 
Germany after World War II has been making headway in restoring and de- 
veloping statistical teaching in the socioeconomic sciences. 
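
The variate difference device described above can be illustrated by a minimal sketch in Python. It is an illustration under stated assumptions, not Anderson's own computation: the function names `difference` and `variance_estimate`, the quadratic trend, and the simulated normal residuals are all hypothetical. The sketch relies on the standard fact that, for independent residuals with a common variance, the differences of order k have variance equal to the binomial coefficient C(2k, k) times the residual variance, while a polynomial of degree below k is removed by differencing k times.

```python
import math
import random


def difference(series, order):
    """Return the successive differences of the given order."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def variance_estimate(series, order):
    """Estimate the residual variance from the differences of one order.

    Assumes the polynomial component has degree below `order`, so that
    the differenced series has mean zero and variance
    comb(2 * order, order) times the residual variance.
    """
    d = difference(series, order)
    return sum(v * v for v in d) / (len(d) * math.comb(2 * order, order))


# Hypothetical series as in (1): quadratic P(t) plus independent
# standard normal residuals.
rng = random.Random(0)
series = [5.0 + 0.3 * t + 0.002 * t * t + rng.gauss(0.0, 1.0)
          for t in range(400)]

for k in range(1, 6):
    print(k, round(variance_estimate(series, k), 3))
```

Once the order of differencing exceeds the degree of the polynomial, the printed estimates should settle near the residual variance, which is 1 in this simulated series.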

The main strength of Anderson’s scientific oeuvre lies, I think, in the systematic 
coordination of theory and application. Only to a relatively small extent does 
his importance derive from specific contributions, such as the variate difference 
method, or his work in the 1950’s on nonparametric methods (1953a), (1955b), 
(1956b). His most fruitful period was the early and middle 1930’s. The peak is
perhaps marked by his paper on the quantity theory of money (1931a). The
paper is pioneering in subjecting the theory to statistical tests on the basis of 
time series data, and is of considerable historical importance also because his 
articulate discussion of residuals and their properties sheds light on the gradual 
evolution of regression methods. Anderson writes the basic relation in two ways, 


(2a-b) $M_i = K P_i + \eta_i; \qquad P_i = (1/K) M_i - (\eta_i/K)$


where $M_i$ is the money in circulation in the ith time period, K a constant, $P_i$ the
price index, and $\eta_i$ an error term that he refers to as a “disturbance” (Störung)
and interprets as a random variable. Relation (2b) is statistically estimated by
the regression of P on M, and in a key passage (pp. 538-541) Anderson postulates
that $\eta_i$ has mathematical expectation zero, and says that (2b) “follows
immediately” from (2a). This last conclusion shows that Anderson deals with 
the residuals as measurement errors, as “errors in variables,” not as “errors in 
equations” that would allow the twofold interpretation of being due to neglected 
causal factors, and of having zero expectation since they constitute the devia- 
tion from the conditional expectation of the left-hand variable. More precisely, 
the residuals cannot be interpreted as “errors in equations” both in (2a) and 
(2b), for conditional mathematical expectations and theoretical regressions are 
not reversible in the sense of (2a-b), as has been well known since the beginnings
of correlation theory [8]. Thus we see from (2) that model construction
had begun to take deviations between theory and observation into explicit
account as random variables, but the statistical implications were only partly 
understood. It is tantalizing that Anderson came very near to an explicit formu- 
lation of the question whether it is M that influences P or P that influences M, 
two hypotheses about causal directions that can be formulated as in (2a-b), 
and equally tantalizing that only a few years later Holbrook Working [9] found 
a statistical device that can be used for discriminating between such causal 
hypotheses, a device that was left unnoticed for some 25 years.⁴
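
The remark that theoretical regressions are not reversible in the sense of (2a-b) can be checked numerically. The following is a minimal sketch with made-up figures, not Anderson's data: the lists `money` and `prices` and the helper `ls_slope` are hypothetical, and for simplicity the regressions are fitted with an intercept. The product of the two least squares slopes equals the squared correlation coefficient, so the slope of P on M is the reciprocal of the slope of M on P only when the correlation is perfect.

```python
def ls_slope(x, y):
    """Least squares slope of the regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical observations of money in circulation and the price index.
money = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0]
prices = [1.1, 1.0, 1.6, 1.7, 2.6, 2.8]

slope_p_on_m = ls_slope(money, prices)   # estimate of 1/K from (2b)
slope_m_on_p = ls_slope(prices, money)   # estimate of K from (2a)

# The product equals the squared correlation, hence is below 1
# unless the points lie exactly on a line.
product = slope_p_on_m * slope_m_on_p
print(slope_p_on_m, 1.0 / slope_m_on_p, product)
```

The two printed estimates of 1/K differ, which is the numerical counterpart of the statement that the residuals cannot be "errors in equations" in both (2a) and (2b).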

It is no easy task to coordinate theory and observation in applied work. 
Anderson was well aware of the difficulties. In this vein is his constant warning 
that ever so refined statistical techniques are of no use unless they are applied 
to reliable observations. In the same vein is his critical attitude towards the 
modern tendencies of developing statistical theory for theory’s own sake. His 
sneers in this direction had a special sting when referring to some of the lofty 
developments of econometrics. To comment upon this last point, Anderson’s 
scepticism, valid or not, was partly intuitive. Econometrics in the 1920’s and 
early 1930’s was a melting pot for new developments, but the time was not 
yet ripe for an adequate treatment of some of the ensuing problems. The situa- 
tion is amply illustrated by Anderson’s work on the variate difference method. 
The residual assumptions in (1) are often too narrow; possibilities for a rigorous 
treatment of more realistic assumptions (such as autocorrelation in the residuals) 
did not arrive until 1933 when Kolmogorov [10] strengthened the mathematical 
basis of probability theory and thereby laid the foundations for the theory of 
stochastic processes. Another case in point, more important with regard to the 
general developments in applied statistics, is Anderson’s emphasis on correlation 
and regression methods for purposes of causal analysis. In accordance with the 
general trend of econometrics he makes a gradual shift from correlation to re- 
gression, as is clearly seen from his textbooks of 1935 and 1954. Similarly, his 
early works (1929c), (1931b) involve half-truths in line with the famous dictum 
“Correlation is not the same as causation’’; later on he realizes that regression 
analysis is an important tool for the empirical assessment of causal relation- 
ships. His treatment of the basic questions is somewhat vague and intuitive, 
and to some extent it had to be at the time. As illustrated by (2), model builders 
had begun to take residual errors into explicit account; the transition from exact 
to disturbed relationships was a radical generalization of the model, and so was 
the ensuing reinterpretation of exact forecasts as stochastic forecasts in terms 
of conditional expectations; the generalization had implications at a basic level 
that could be understood and developed only gradually. There is here a direct 
connection between the situation in (2a-b) and the basic problems about
“simultaneous equations” that later on have been much discussed in econometrics.⁵
For example, if we consider a theoretical autoregression, say


⁴ See [16] for a detailed discussion.
⁵ Specific reference is made to the dualism between causal chain systems [11] vs. interdependent
systems [12]. For a review and development from the present point of view, see
[14]-[16].
(3) $y_t = \alpha y_{t-1} + \varepsilon_t$ with $E(y_t \mid y_{t-1}) = \alpha y_{t-1}, \qquad t = 0, \pm 1, \pm 2, \ldots,$

it follows under very general conditions (a) that $\alpha$ can be consistently estimated
by the least squares regression of $y_t$ on $y_{t-1}$, and (b) that


(4) $y_t = \alpha^2 y_{t-2} + \varepsilon_t^{*}$ with $E(y_t \mid y_{t-2}) = \alpha^2 y_{t-2}$ and $\varepsilon_t^{*} = \varepsilon_t + \alpha \varepsilon_{t-1}$.


A rigorous deduction of the substitutive relation (4) requires some general 
theorems on conditional expectations and stochastic processes first established 
by Kolmogorov [10]. 
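
Statements (a) and (b) can be illustrated, though of course not proved, by simulation. The sketch below is only an illustration under stated assumptions: the value 0.6 for the autoregressive coefficient, the normal residuals, the series length, and the helper names `simulate_ar1` and `slope_through_origin` are all hypothetical choices.

```python
import random


def simulate_ar1(alpha, n, rng):
    """Simulate y_t = alpha * y_{t-1} + e_t with independent N(0, 1) residuals."""
    y = [0.0]
    for _ in range(n):
        y.append(alpha * y[-1] + rng.gauss(0.0, 1.0))
    return y[1:]


def slope_through_origin(x, y):
    """Least squares slope of y on x, without intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


rng = random.Random(1)
alpha = 0.6
y = simulate_ar1(alpha, 20000, rng)

# (a) regression of y_t on y_{t-1} estimates alpha
alpha_hat = slope_through_origin(y[:-1], y[1:])
# (b) regression of y_t on y_{t-2} estimates alpha squared, as in (4)
alpha_sq_hat = slope_through_origin(y[:-2], y[2:])

print(alpha_hat, alpha_sq_hat, alpha ** 2)
```

With a long series the first estimate should lie close to 0.6 and the second close to 0.36, in line with the substitutive relation (4).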

Oskar Anderson in his most active years was one of the leaders of econometrics, 
and thereby a pioneer in a broad sector of applied statistics: causal analysis on 
the basis of nonexperimental data. The same period, say from 1915 to 1940, 
was one of epoch-making developments in other sectors of statistics, with R. A.
Fisher and J. Neyman for leading names, developments that in common parlance 
constitute “modern statistics” and are too well known to be elaborated here. A 
point I wish to stress is that the powerful methods of “modern statistics” are 
primarily designed for three broad sectors of applied statistics: (i)-(ii) description
and causal analysis on the basis of experimental data, and (iii) description
(by sampling techniques) on the basis of nonexperimental data. Sector (iv), 
causal analysis of nonexperimental data—an area where the model builder is 
confronted with more difficult problems in specifying the stochastic structure 
of the models as well as in their statistical treatment—has long been neglected 
by the cadre of professional statisticians.⁶ This is clearly seen if Anderson’s
textbooks, with their emphasis on sector (iv), are compared with the textbooks
of “modern statistics”, with their emphasis on the three other sectors. In the
last ten years or so sector (iv) has gradually come forward, but it is still relatively 
underdeveloped. 

We have described Anderson as a pioneer in a difficult and important area of 
statistics, or perhaps as a forerunner rather than a pioneer, for the area was not 
yet ripe for systematic development. The handicap only makes his work so much 
the more significant, and so do other handicaps of a more local nature. One is 
the antitheoretical attitude of statistical science in Germany in the beginning 
of this century. After the flourishing period of German statistics in the 19th 
century with names like Lexis in social statistics and statistics in general, Becker, 
Knapp and Zeuner in demography, Paasche and Laspeyres in economics, Weber 
and Ebbinghaus in psychology, it is something of a mystery how the development
could stagnate so rapidly. And not only this; the socioeconomic sciences in 
Germany were the arena of an unfruitful struggle between two lines of thought. 
A typical example is sociology, where the “historical” school had Max Weber 
as leading name, and the “systematic” school was headed by Georg Simmel. 
What I am thinking of here is that model building was almost completely non- 
existent in the camps that were lined up in the “Methodenstreit”, while—on 
the contemporary international scene—model building had already become the 
vehicle for steady progress in economics and econometrics. It would seem that 


⁶ See [13] for an elaboration of the argument.
Anderson’s contributions in the direction of model building were hampered by
the “Methodenstreit”. Yet the germs are there, and, even if the seedlings got
mixed with some weeds, in a general statistical setting that allows us to view 
the principles and methods at issue as applicable not only in econometrics but 
over the entire area of nonexperimental model building. These germs emerge 
as Anderson’s most valuable and important contribution. I wish to pay per- 
sonal tribute to the inspiring influence of this aspect of Anderson’s work. 

Oskar Anderson’s scientific status was marked by several distinctions, among
them:

Honorary Doctor at the University of Vienna; 

Honorary Doctor at the Institute of Economics, Mannheim; 

Honorary Member of the Royal Statistical Society; 

Honorary Member of the German Statistical Society; 

Founder and Fellow of the Econometric Society; 

Member of the International Statistical Institute; 

Fellow of the American Statistical Association; 

Fellow of the Institute of Mathematical Statistics. 

Anderson was a man of grandeur, both in his work and his personal appear- 
ance. His tall, handsome and somewhat stout figure was seen at several scientific 
meetings after World War II. Particularly dear to me are the memories from 
the Scandinavian week at Munich University in July 1958, when I had the privilege
of visiting him in his own milieu: the institute that he had founded, his graduate 
seminar, and his large group of students. 
ON FIDUCIAL INFERENCE¹
By D. A. S. Fraser


University of Toronto²


1. Introduction. The subject of fiducial probability was introduced thirty years 
ago by R. A. Fisher. In the original paper [8] entitled “Inverse probability” 
Fisher discussed the importance of the maximum likelihood method and then 
produced a fiducial distribution for a parameter in roughly the following manner. 
Let T be a maximum likelihood estimate of a parameter $\theta$. The distribution
function for T given $\theta$, $F(T \mid \theta)$, has a uniform distribution on the interval
(0, 1). Differentiating partially with respect to T gives the probability density
function for T given $\theta$:


$\frac{\partial}{\partial T} F(T \mid \theta).$


Differentiating partially with respect to $\theta$ gives a function treated as a density
function for “the fiducial distribution of a parameter $\theta$ for a given statistic T.”
From this density function, “fiducial limits” for the parameter $\theta$ given T can be
calculated.

As an illustration Fisher treated the correlation coefficient r for sampling 
from a normal bivariate population having correlation $\rho$. The supporting interpretation
for the fiducial method in this example seems to me very much like a
present-day confidence argument. This, I gather, led Professor Neyman in 1934 
[14] to present his theory of confidence intervals as an extension of the fiducial 
method. Both Fisher and Neyman have since emphasized that the theories are 
different and the recent literature stands in testimony to the large separation 
now existing between them. 

Today I shall review some of the problems that have been analyzed by the 
fiducial method and discuss briefly some of the results obtained for these problems;
also, I shall put forward a mathematical framework³ within which I feel
fiducial probability has a clear frequency interpretation for a large class of problems.
A natural beginning is Fisher’s [2] statement: “By contrast, the fiducial
argument uses the observations only to change the logical status of the parameter 
from one in which nothing is known of it, and no probability statement about it 
can be made, to the status of a random variable having a well defined distribu- 
tion.” Such statements have perturbed many mathematical statisticians. 
Annual Meeting of the Institute of Mathematical Statistics in Stanford, California. The
address was prepared on the invitation of the IMS Committee on Special Invited Papers.

² Present address: Bell Telephone Laboratories, Murray Hill, N. J.

³ The development and the proof will be found in [10].
2. A simple example yielding a frequency interpretation. Consider a sample of
n observations from a normal distribution with unknown mean $\mu$ and known
variance. A sufficient statistic is the sample mean $\bar{x}$. For simplicity suppose the
known variance to be such that $\bar{x}$ is normally distributed with mean $\mu$ and variance
1. The information concerning frequency distribution can be described in several
ways. First, it can be described by means of the frequency distribution of $\bar{x}$;
see Fig. 1. Second, it can be described in terms of the frequency distribution of
the error $g = \bar{x} - \mu$; see Fig. 2. Borrowing from notions in the theory of measurement
we might then say that $\mu$ was “measurable” in the sense that that which can
be observed is in error, or fluctuates, in known frequency form about $\mu$. As a
third way consider the following. In the light of the available information concerning
frequency distribution, let $\mu^*$ designate possible values for the parameter
relative to an observed $\bar{x}$; see Fig. 3. The statistical problem admits free translation
on the x axis. Consider a very large number of samples from normal distributions
within the specifications of this example. In each case translate the sample
mean to the value in Fig. 3. The parameter values will be correspondingly translated.
Simple mathematics then shows that the frequency distribution of these
translated means $\mu^*$ is normal with center at $\bar{x}$ and with unit scale parameter.
There is thus a frequency distribution of parameter values $\mu^*$ that might have
produced the observed $\bar{x}$. A probability statement of the form


$\Pr\{\bar{x} - 1.96 < \mu^* < \bar{x} + 1.96\} = 95\%$


can then be made with $\bar{x}$ fixed and with $\mu^*$ treated as a variable designating
possible values for the parameter. The asterisk on the $\mu$ can even be omitted
provided we keep the interpretation just given. Translation of any kind of repeated
sampling (n fixed) so that sample means are moved to the observed $\bar{x}$
will yield fixed frequency (95% in the case just given) for the event, the population
means falling in any prescribed interval about $\bar{x}$. The process leading to the
above probability statement might be compared with a gambling game in which 
the dice are rolled but concealed from view; bets made; then the dice exposed. 
With honesty, or with perfect concealment, such a game would be equivalent to 
one in which the bets are made before the dice were rolled. 

In this example, the freedom of translation has, I feel, produced a precise 
frequency interpretation for fiducial probability. 
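
The translation argument of this section lends itself to a simple simulation. The following is a minimal sketch under stated assumptions, not part of the original development: the observed value 3.7, the uniform range from which the population means are drawn, the number of trials, and the function name `translated_parameters` are all hypothetical. In each trial a population mean is chosen arbitrarily, a sample mean with unit variance is drawn, and both are translated so that the sample mean lands on the observed value.

```python
import random


def translated_parameters(x_obs, trials, rng):
    """Translate each (mean, sample mean) pair so the sample mean hits x_obs."""
    values = []
    for _ in range(trials):
        mu = rng.uniform(-50.0, 50.0)      # parameter may vary from case to case
        x_bar = mu + rng.gauss(0.0, 1.0)   # sample mean with unit variance
        shift = x_obs - x_bar              # translation carrying x_bar to x_obs
        values.append(mu + shift)          # correspondingly translated parameter
    return values


rng = random.Random(2)
x_obs = 3.7
mu_star = translated_parameters(x_obs, 100000, rng)

# Relative frequency of translated means within 1.96 of the observed value.
coverage = sum(abs(m - x_obs) < 1.96 for m in mu_star) / len(mu_star)
print(coverage)
```

Whatever mechanism generates the population means, the translated values scatter about the observed sample mean with unit scale, so the printed frequency should be close to 95%.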


3. A generalization of the simple example. Let X be a sample space for a 
sufficient statistic x; let G be a group of transformations on X with typical element 
h; let $\Omega$ be a parameter space with parameter $\theta$. Suppose that the spaces X, G, $\Omega$
are identical and hence are groups, and that the distribution of x for any parameter
value $\theta$ is obtained by using $\theta$ as a transformation (left group multiplication)
applied to a variable g with a fixed distribution; g will be referred to as the error
variable and its distribution as the error distribution. With this formulation the
specification, i.e., the family of possible distributions, is invariant under each
transformation belonging to the group G; indeed $hx = (h\theta)g$ for h in G shows
that the variable hx has parameter $h\theta$ when x has parameter $\theta$. Further details
on this generalization may be found in [10].
The equation $x = \theta g$ can be simply manipulated to produce


$g = \theta^{-1} x.$


Since g has a fixed frequency distribution this equation shows that $\theta^{-1} x$ is a
pivotal quantity (a function of the parameter and the sufficient statistic that has
a fixed frequency distribution). A simple analysis [10] then shows that $\theta^{-1} x$ is
essentially unique among invariant pivotal quantities. The frequency information
in the specification of the problem can then be compactly formulated in
terms of the fixed distribution for the error variable g. From this latter formulation
the distribution of x given $\theta$ can be obtained from the equation $x = \theta g$
and the distribution of $\theta$ given x from the equation $\theta = x g^{-1}$. A simple interpretation
of this second equation is in terms of possible values $\theta$ for the parameter:
let x be an observed value from the sufficient statistic; consider the sufficient
statistic with distribution determined by some parameter value; use a variable
transformation to transform the variable sufficient statistic into the fixed value
x; the parameter value will correspondingly be transformed into a frequency
distribution, the fiducial distribution given by the equation $\theta = x g^{-1}$.

Suppose now that the distributions can be described by means of density
functions. The natural “carrying” measure on a group is the invariant or Haar
measure.⁴ Let $\mu$ designate the left Haar measure on G; it has the property
$\mu(hH) = \mu(H)$ for all h in G and all $H \subset G$. A related measure is right Haar
measure; designate it by $\nu$; it has the property $\nu(Hh) = \nu(H)$ for all h in G and
all $H \subset G$. The modular function $\Delta(h)$, defined by $\mu(Hh) = \Delta(h)\mu(H)$ (which
holds for all H), gives the relationship between left and right Haar measures:
$d\mu = \Delta\, d\nu$.

Let p(g) be the probability density function for the error variable g with
respect to left Haar measure; accordingly, the probability element for g is


(1) $p(g)\, d\mu(g).$

Simple analysis [10] then shows that the probability element for x given $\theta$ is

(2) $p(\theta^{-1} x)\, d\mu(x),$

and for $\theta$ given x is

(3) $p(\theta^{-1} x)\, \Delta(x)\, d\nu(\theta).$


This last expression describes a standard conditional distribution for $\theta$ given x
within the mathematical model just presented. The freedom to make transforma- 
tions can thus be used to remove the rigidity that results when seemingly undue 
emphasis is placed on the “origin” in the coordinate system. 


4. Sampling from a normal distribution. Consider a sample of size n from a 
normal distribution with unknown mean and variance. The sample mean and 
standard deviation $(\bar{x}, s)$ form a sufficient statistic for the parameter $(\mu, \sigma)$,
the mean and standard deviation of the normal distribution. For most physical
problems from which this statistical problem might have been abstracted, the
origin and the unit for measurement are arbitrary or conventional; a natural
group of transformations then is that involving location and scale changes. The
multiplication of this group then yields the representation


⁴ See, for example, Chap. XI in Halmos, [11].

$(\bar{x}, s) = (\mu, \sigma) g = (\mu, \sigma) \left( \frac{z}{\sqrt{n}}, \frac{\chi}{\sqrt{n-1}} \right),$


which expresses the distribution of $(\bar{x}, s)$ in terms of relocation ($\mu$) and rescaling
($\sigma$) of the distribution in the standardized case; here z and $\chi$ are independent
variables, standardized normal and chi on n - 1 degrees of freedom respectively.
The formula for group multiplication is (a, b)(c, d) = (a + bc, bd). By working
from this fixed error distribution for g it is straightforward to calculate the conditional
distribution of $(\bar{x}, s)$ given $(\mu, \sigma)$, or the conditional distribution of $(\mu, \sigma)$
given $(\bar{x}, s)$, the latter being the fiducial distribution. These are ordinary conditional
distributions resulting from the freedom of location and scale.

A frequency interpretation for the fiducial distribution of $\mu$ given $(\bar{x}, s)$ can
proceed as follows: let $(\bar{x}, s)$ designate the observed values; consider a long
sequence of parameter-observation combinations in which the parameter may
stay constant or may vary; in each case relocate and rescale to bring the sample
mean to the fixed $\bar{x}$ and the sample standard deviation to the fixed s; these transformations
then carry the parameter values into a frequency distribution which
simple analysis shows to be Student’s distribution located at $\bar{x}$ and scaled by
$s/n^{1/2}$. This is the fiducial distribution of $\mu$ given $(\bar{x}, s)$.
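
The relocation and rescaling just described can be sketched in Python with the group multiplication (a, b)(c, d) = (a + bc, bd) given above. This is a minimal illustration under stated assumptions, not the derivation in [10]: the sample size n = 5, the observed values (10.0, 2.0), the ranges from which the parameters are drawn, and the function names are hypothetical, and 2.776 is the tabulated two-sided 5 per cent point of Student's distribution on 4 degrees of freedom.

```python
import math
import random


def multiply(g, h):
    """Location-scale group product (a, b)(c, d) = (a + b*c, b*d)."""
    a, b = g
    c, d = h
    return (a + b * c, b * d)


def inverse(g):
    """Group inverse, so that multiply(g, inverse(g)) == (0, 1)."""
    a, b = g
    return (-a / b, 1.0 / b)


def sample_statistic(theta, n, rng):
    """Sample mean and standard deviation of n normal observations."""
    mu, sigma = theta
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    mean = sum(xs) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in xs) / (n - 1))
    return (mean, s)


def fiducial_draw(x_obs, n, rng):
    """Carry one parameter-observation pair so the statistic lands on x_obs."""
    theta = (rng.uniform(-10.0, 10.0), rng.uniform(0.5, 5.0))  # may vary freely
    x = sample_statistic(theta, n, rng)
    h = multiply(x_obs, inverse(x))   # the transformation with h x = x_obs
    return multiply(h, theta)         # transformed parameter, i.e. x_obs g^{-1}


rng = random.Random(3)
n = 5
x_obs = (10.0, 2.0)
draws = [fiducial_draw(x_obs, n, rng) for _ in range(50000)]

t_crit = 2.776  # tabulated value for 4 degrees of freedom
half_width = t_crit * x_obs[1] / math.sqrt(n)
coverage = sum(abs(mu - x_obs[0]) < half_width for mu, _ in draws) / len(draws)
print(coverage)
```

The printed frequency should be close to 95%, as expected if the transformed means follow Student's distribution located at the observed mean and scaled by the observed standard deviation divided by the square root of n.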

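The general expressions for probability elements in the preceding section can
easily be specialized. Left Haar measure under the location-scale group has
measure element $d\bar{x}\, ds/s^2$ and right Haar measure has element $d\bar{x}\, ds/s$; the modular
function is $\Delta(\bar{x}, s) = 1/s$. The probability element for $(\bar{x}, s)$ given $(\mu, \sigma)$ has
the form


(4) $\frac{n^{1/2}}{(2\pi)^{1/2}\sigma} \exp\left[-\frac{n(\bar{x}-\mu)^2}{2\sigma^2}\right] \cdot \frac{1}{\Gamma(\frac{1}{2}(n-1))} \exp\left[-\frac{(n-1)s^2}{2\sigma^2}\right] \left(\frac{(n-1)s^2}{2\sigma^2}\right)^{\frac{1}{2}(n-1)-1} \frac{2s(n-1)}{2\sigma^2}\, s^2\, \frac{d\bar{x}\, ds}{s^2}.$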


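(Note the rearrangement so that left Haar measure is used for the measure
element.) The probability element for $(\mu, \sigma)$ given $(\bar{x}, s)$ is then easily seen to be


$\frac{n^{1/2}}{(2\pi)^{1/2}\sigma} \exp\left[-\frac{n(\bar{x}-\mu)^2}{2\sigma^2}\right] \cdot \frac{1}{\Gamma(\frac{1}{2}(n-1))} \exp\left[-\frac{(n-1)s^2}{2\sigma^2}\right] \left(\frac{(n-1)s^2}{2\sigma^2}\right)^{\frac{1}{2}(n-1)-1} \frac{2s(n-1)}{2\sigma^2}\, s^2 \cdot \frac{1}{s}\, \frac{d\mu\, d\sigma}{\sigma}.$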

This probability element for $(\mu, \sigma)$ given $(\bar{x}, s)$ admits a precise frequency interpretation
in terms of possible parameter values corresponding to the particular
$(\bar{x}, s)$. Consequently it can be integrated to produce marginal frequency distributions
with similar interpretations for “variables” such as $\mu$, $\sigma$, $\mu + 1.96\sigma$, always
of course, given $(\bar{x}, s)$. These fiducial distributions correspond to those given by
Fisher (e.g., in his Statistical Methods and Scientific Inference [2]).

The use of location and scale transformations has led to a mathematical 
framework for a problem that Fisher treated in a physical setting. With a few
deletions I quote from Fisher [9]: “A complementary doctrine . . . violating . . .
the principles of deductive logic is to accept a general symbolical statement
such as


$\Pr\{(\bar{x} - ts) < \mu < (\bar{x} + ts)\} = \alpha$


as rigorously demonstrated and yet, when numerical values are available for
the statistics $\bar{x}$ and s, so that on substitution of these and use of the 5 per cent.
value of t, the statement would read


$\Pr\{92.99 < \mu < 93.01\} = 95\%,$


to deny to this numerical statement any validity. This evidently is to deny the 
syllogistic process of making a substitution in the major premise of terms which 
the minor premise establishes as equivalent. ... [This is used to support] the 
assertion that if $\mu$ stands for some objective constant of nature, or property of
the real world, such as the distance of the sun, its probability of lying between 
any named numerical limits is necessarily either 0 or 1 and we cannot know 
which unless the true distance is known to us. The paradox . . . requires that 
we should wilfully misinterpret the probability statement so as to pretend that 
the population to which it refers is not defined by our observations and their 
precision but is absolutely independent of them. As this is certainly not what 
any astronomer means and is not in accordance with the origin of the statement, 
it seems rather like an acknowledgment of bankruptcy to pretend that it is.” 
The framework I have suggested puts emphasis on the error variable “g” and
on the freedom from an artificial origin in the coordinate system. In this frame- 
work there seems to me to be little basis for criticizing the fiducial method. 


5. A further generalization. The model in Section 3 can sometimes be applied
within more general problems if there is an ancillary statistic. Let $(x, a)$ be an
exhaustive statistic for estimating the parameter $\theta$; that is,

(i) The conditional distribution given the statistic $(x, a)$ does not depend
on $\theta$.

(ii) No reduction can be made in $(x, a)$; that is, no non-trivial function of
$(x, a)$ exists that satisfies (i).

Suppose now that $a$ is an ancillary statistic, a statistic having a fixed
distribution regardless of the value of $\theta$. In this situation $x$ can be
interpreted as a sufficient statistic for $\theta$ given the value of the ancillary
statistic $a$. In a sense $a$ describes a situation in which one is caught, and in
which $x$ is sufficient for $\theta$.

For some problems of this type it may be possible to apply the model in
Section 3 to the conditional problem involving the distribution of $x$ given the
value of the ancillary statistic $a$.


6. The problem of location and scale. In 1938 E. J. G. Pitman [15] gave an 
extensive treatment of interval estimation for problems of location, of scale, and 
of location and scale. These problems lend themselves to the methods in the 







preceding sections and the fiducial distributions can be obtained very simply. 
Consider the problem of location and scale; the other problems can be treated 
similarly. 

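Let $(x_1, \cdots, x_n)$ be a sample of $n$ from the distribution having density
function

\[ \frac{1}{\sigma}\, f\!\left(\frac{x - \mu}{\sigma}\right), \]

where $f$ is a specified function and $\mu$, $\sigma$ are the parameters of location and
scale. The density function for a sample of $n$ is

\[ \frac{1}{\sigma^{n}} \prod_{i=1}^{n} f\!\left(\frac{x_i - \mu}{\sigma}\right). \]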


To have the transformation properties for the fiducial method, statistics of
location and scale are needed; the sample mean and standard deviation are a
simple and convenient choice; any other pair would serve, however. The
remaining information in the sample $(x_1, \cdots, x_n)$ can be described by the
relative spacing of the sample values,

\[ a = \left(\frac{x_1 - \bar{x}}{s}, \cdots, \frac{x_n - \bar{x}}{s}\right), \]

which is called a configuration statistic and describes the “shape” of the sample
without reference to its location and its scale. (The elements in the expression
for the statistic above satisfy two constraints and as a result any designated pair
of elements could be omitted from the expression.) The invariance of the
configuration statistic under location and scale changes shows easily that it has
a fixed distribution independent of $(\mu, \sigma)$ and hence is ancillary; its
distribution will however depend on the form of the density function $f$. The
statistic $(\bar{x}, s)$ is conditionally sufficient. The combination $(x, a)$ here is not
exhaustive in the sense that the reduction can be made from the ordered
observations $(x_1, \cdots, x_n)$ to the unordered observations $\{x_1, \cdots, x_n\}$;
this has, however, no effect on the argument here.

The approach put forth in Section 5 is to examine the problem conditionally
given the ancillary statistic $a$. For this, the joint density for $(\bar{x}, s, a)$ given
$(\mu, \sigma)$ is needed. The Jacobian of the transformation from $(x_1, \cdots, x_n)$ to
$(\bar{x}, s, a)$, where two of the elements of $a$ are omitted, turns out to depend on
$(\bar{x}, s)$ through a factor $s^{n-2}$. From this, it follows that the conditional
probability element for $(\bar{x}, s)$ given $(a, \mu, \sigma)$ has the form

\[ k \prod_{i=1}^{n} f\!\left(\frac{x_i - \mu}{\sigma}\right) \frac{s^{n}}{\sigma^{n}} \cdot \frac{d\bar{x}\, ds}{s^{2}}; \tag{6} \]

the measure element is arranged to exhibit left Haar measure. The formulas of
Section 3 and the Haar measure results in Section 4 then produce the following
fiducial element for $(\mu, \sigma)$ given $(\bar{x}, s, a)$:

\[ k \prod_{i=1}^{n} f\!\left(\frac{x_i - \mu}{\sigma}\right) \frac{s^{n}}{\sigma^{n}} \cdot \frac{d\mu\, d\sigma}{s\sigma}. \tag{7} \]

The frequency interpretation of this fiducial element is in terms of possible values
for $(\mu, \sigma)$ relative to $(\bar{x}, s)$, all given the value of the ancillary statistic $a$.
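As a rough numerical illustration of the fiducial element for location and scale, the sketch below evaluates the unnormalized density of $(\mu, \sigma)$, assuming it has the form $\prod_i f((x_i - \mu)/\sigma)\,(s/\sigma)^n / (s\sigma)$ with respect to $d\mu\, d\sigma$. The Cauchy choice of $f$, the function names, and the use of the $n - 1$ divisor for $s$ are assumptions made only for this example; the constant $k$ is omitted.

```python
import math
import statistics


def cauchy_pdf(z):
    """Standard Cauchy density, chosen here only as an example of f."""
    return 1.0 / (math.pi * (1.0 + z * z))


def fiducial_density(mu, sigma, xs, f=cauchy_pdf):
    """Unnormalized fiducial density of (mu, sigma) w.r.t. d(mu) d(sigma).

    Assumed form: prod_i f((x_i - mu)/sigma) * (s/sigma)**n / (s*sigma).
    The normalizing constant k (a function of the configuration) is omitted.
    """
    if sigma <= 0:
        return 0.0
    n = len(xs)
    s = statistics.stdev(xs)  # assumed definition of s (divisor n - 1)
    product = 1.0
    for x in xs:
        product *= f((x - mu) / sigma)
    return product * (s / sigma) ** n / (s * sigma)


if __name__ == "__main__":
    sample = [1.2, 0.4, 2.9, 1.7, 1.1]  # hypothetical data
    print(fiducial_density(1.5, 1.0, sample))
```

Rescaling the data and the parameters by a common factor $c$ changes this density only by the factor $1/c^2$ contributed by $d\mu\, d\sigma/(s\sigma)$, which is one way to see the role of the Haar measure element.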


7. The relationship of fiducial distributions and prior distributions.⁵ Consider a
statistical problem of the form introduced in Sections 3 or 5, and suppose that
there is an a priori distribution for the parameter $\theta$ having a density function
with respect to Haar measure; let the a priori probability element be

\[ n(\theta)\, d\nu(\theta) \]

with respect to right Haar measure. The analysis of Sections 3 or 5, which ignores
prior distributions, produces a fiducial probability element

\[ k\, p(\theta^{-1}x)\, d\nu(\theta) \]

for the parameter given the sufficient or conditionally sufficient statistic $x$.
The prior distribution can be combined with the observational results in two
ways:

By a Bayes argument. The formal joint probability element for $x$ and $\theta$ is

\[ n(\theta)\, d\nu(\theta)\, p(\theta^{-1}x)\, d\mu(x) \]

which yields the following a posteriori element for $\theta$ given $x$:

\[ k\, n(\theta)\, p(\theta^{-1}x)\, d\nu(\theta). \tag{8} \]


By a joint distribution argument. Consider the joint distribution of the prior
value of $\theta$ and the possible value of $\theta$ given the observation $x$. Introduce
the condition that these two values are equal, taken with respect to right Haar
measure. The resulting conditional distribution for $\theta$ is just that produced by
the Bayes argument in the preceding paragraph.

Thus within the transformation framework of Sections 3 and 5 the information
about $\theta$ extracted from the observations by means of the fiducial argument can
be combined in a logical manner with prior information and be entirely
consistent with the Bayes approach.
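A worked illustration may help here; it is not taken from the source. Assume the location group on the real line (for which Haar measure is ordinary Lebesgue measure), a single observation $x$ whose error is normal with known variance $\sigma^2$, and a normal prior density $n(\theta)$ with hypothetical mean $m$ and variance $\tau^2$. Under these assumptions the combined element is again normal:

```latex
\begin{aligned}
p(\theta^{-1}x) &= \frac{1}{\sigma\sqrt{2\pi}}
   \exp\!\left\{-\frac{(x-\theta)^2}{2\sigma^2}\right\}, \qquad
n(\theta) = \frac{1}{\tau\sqrt{2\pi}}
   \exp\!\left\{-\frac{(\theta-m)^2}{2\tau^2}\right\}, \\[4pt]
k\, n(\theta)\, p(\theta^{-1}x)\, d\theta &\propto
   \exp\!\left\{-\frac{1}{2}\left(\frac{1}{\sigma^2}+\frac{1}{\tau^2}\right)
   \bigl(\theta-\hat\theta\bigr)^2\right\} d\theta, \qquad
\hat\theta = \frac{x/\sigma^2 + m/\tau^2}{1/\sigma^2 + 1/\tau^2}.
\end{aligned}
```

Letting $\tau \to \infty$ makes the prior density constant and returns the fiducial distribution itself, normal with mean $x$ and variance $\sigma^2$.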

Some other results on the relationship of fiducial probability and prior
probability are the following. Lindley [12] has proved for a real-valued parameter
that the fiducial distribution is a Bayes posterior distribution if and only if the
parameter is essentially a location parameter. D. R. Brillinger in unpublished
research at Princeton University has proved that the fiducial distribution is a
Bayes posterior distribution if the transformations on the sample space form an
r-dimensional Lie group and the sample and parameter spaces are r-dimensional
manifolds. In the framework of Sections 3 and 5, setting $n(\theta) = c$ in the formula
for the Bayes posterior distribution (8) gives the formula for the fiducial
probability element. Thus the fiducial distribution is a Bayes posterior
distribution if the sample and parameter spaces are identical to the
transformation group and the distributions are given by means of density
functions with respect to right Haar measure.

⁵ For the details of the analysis in this section, see [10].


8. Combining fiducial distributions from separate systems.⁶ Suppose there are
two systems concerned with a parameter $\theta$ and suppose that each is of the form
introduced in Sections 3 or 5. Let $x$ be a sufficient statistic, and $p(g)$ be the
density function for the random “error” in a first system, and let $y$, $q(h)$ be the
corresponding ingredients of the second system. Consider the following three
methods of combining the information from these two systems.

By direct combination of fiducial distributions. The first system produces
the fiducial element $p(\theta_1^{-1}x)\, d\nu(\theta_1)$; the second system the fiducial element
$q(\theta_2^{-1}y)\, d\nu(\theta_2)$. A reasonable method of combining these is by reference to
the condition $\theta_1 = \theta_2$ re right Haar measure. The resultant fiducial element for
$\theta$ is

\[ k\, p(\theta^{-1}x)\, q(\theta^{-1}y)\, d\nu(\theta). \]


By a Bayes argument. If the fiducial distribution from the first system is used 
as a prior distribution for the second system, the resulting posterior distribution 
is just that obtained by the first method of analysis. 

By an overall fiducial argument. If the combined observational system admits a 
sufficient statistic then the combined system has the form in Sections 3 and 5 
and yields a fiducial distribution. A modest amount of analysis shows that it is 
just the distribution obtained by the first two methods. 

If the combined system does not have a sufficient statistic it at least has a 
conditionally sufficient statistic given an ancillary statistic, since the orbits in 
the product space for $(x, y)$ generated by the transformations in the group
may be taken as values of an ancillary statistic and the positions on an orbit may 
be taken as the values of the conditionally sufficient statistic. The fiducial dis- 
tribution then obtained by the method suggested in Section 5 is just that ob- 
tained by the other methods. 

In the Pitman location-and-scale problem consider separate groups of observa- 
tions and treat them as separate systems. The results described above then show 
that the fiducial distributions from the separate groups can be combined by 
either of the first two methods yielding the overall fiducial distribution based on 
the ancillary statistic. Thus the ancillary statistic itself is generated by a general 
fiducial argument. 
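The direct-combination rule can be checked numerically in the simplest case. The sketch below assumes the location group on the real line, so that $\theta^{-1}x = x - \theta$ and Haar measure is Lebesgue measure, and approximates the combined element $k\,p(x - \theta)\,q(y - \theta)\,d\theta$ on a grid. The function names, grid limits, and the normal error densities are illustrative assumptions, not part of the paper.

```python
import math


def normal_pdf(z, sd=1.0):
    """Density of a normal error with mean 0 and standard deviation sd."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def combine_location_fiducial(x, y, p, q, lo, hi, steps=20001):
    """Grid approximation to the combined element k p(x - t) q(y - t) dt.

    p and q are the error densities of the two systems; the location group
    is assumed, so theta^{-1} x is simply x - theta.
    """
    h = (hi - lo) / (steps - 1)
    grid = [lo + i * h for i in range(steps)]
    weights = [p(x - t) * q(y - t) for t in grid]
    k = 1.0 / (sum(weights) * h)  # normalizing constant
    density = [k * w for w in weights]
    mean = sum(t * d for t, d in zip(grid, density)) * h
    return grid, density, mean


if __name__ == "__main__":
    # Hypothetical observations: x = 1 with error sd 1, y = 4 with error sd 2.
    _, _, mean = combine_location_fiducial(
        1.0, 4.0,
        lambda g: normal_pdf(g, 1.0),
        lambda g: normal_pdf(g, 2.0),
        -20.0, 25.0,
    )
    print(mean)  # close to the precision-weighted value (1/1 + 4/4) / (1 + 1/4)
```

With normal errors the combined distribution is centred at the precision-weighted average of the two observations, which is also what a Bayes argument with the first fiducial distribution as prior would give.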


⁶ For the details of the analysis in this section, see [10].


9. Fiducial probability and statistical inference. The results concerning
frequency interpretation and freedom of handling of fiducial probabilities have, I
feel, some general implications for statistical inference. A formal statistical
problem contains certain basic information concerning possible frequency
functions for the variables being observed. This has been called the specification
by Fisher. If the specification has the transformation properties of Sections 3 or
5, then the fiducial argument gives a frequency distribution of possible parameter
values given the data. In this framework then fiducial probability gives an answer
to the question D. R. Cox [5] in his 1958 paper felt that statistical inference
should answer: “What do the data tell us about $\theta$?”

In addition, many statistical problems pose some question concerning the 
parameter or concerning some function of the parameter. The interpretation of 
fiducial probability in this paper suggests to me that the answering of such ques- 
tions can be and perhaps should be a separate part of the analysis which would 
use from the observations only the fiducial distribution; in fact, accepting the 
fiducial distribution as a frequency distribution of possible values for the param- 
eter allows the answering of questions concerning the parameter even in cases 
where the transformation properties do not hold for these questions. Some 
support for this suggestion may be found in the result in the preceding section, 
that the fiducial distribution can be combined in a logical manner with the prior 
distribution and still be consistent with Bayes. A statistical inference would then 
involve combining with logic and judgment the fiducial information and any 
other information—of frequency form, or of a restriction on the parameter range, 
or of personal probability form. Marginal distributions of parameters, even if 
related to fixed points in the sample space, would have a frequency interpreta- 
tion. 


10. The Behrens-Fisher problem. The specification of the Behrens-Fisher
problem describes two samples from normal populations with unknown
parameters. The problem is to estimate or make tests on the difference in
population means. For a first system let $(\bar{x}_1, s_1)$ be the sufficient statistic
for $(\mu_1, \sigma_1)$ based on a sample size $n_1$ and for the second system
$(\bar{x}_2, s_2)$ for $(\mu_2, \sigma_2)$ based on a sample size $n_2$.

The specification for each system admits scale and location changes and these
would often be reasonable for the problem in a physical setting. For the first
system repeated sampling from the random error variable yields the following
fiducial distribution for $\mu_1$:

\[ \mu_1 = \bar{x}_1 + t_1 s_1/\sqrt{n_1} \]

where $t_1$ is the variable and it has Student’s distribution with $n_1 - 1$ degrees of
freedom. Similarly for the second system repeated sampling from the random
error variable yields

\[ \mu_2 = \bar{x}_2 + t_2 s_2/\sqrt{n_2} \]

where $t_2$ is the variable and it has Student’s distribution with $n_2 - 1$ degrees of
freedom. These distributions together provide a distribution for $(\mu_1, \mu_2)$ and it
has a frequency interpretation as derived from transformation freedom.

From the joint distribution for $(\mu_1, \mu_2)$ the marginal distribution of
$\mu_1 - \mu_2$ can be derived in a straightforward manner; percentage points for this
marginal distribution can be obtained from Sukhatme’s tables.
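Where tables are not at hand, the marginal distribution of $\mu_1 - \mu_2$ can be approximated by simulation. The following is a minimal sketch, assuming independent Student variables $t_1$ and $t_2$ and the two defining equations $\mu_i = \bar{x}_i + t_i s_i/\sqrt{n_i}$; the function names, the summary statistics, and the number of draws are hypothetical choices for illustration.

```python
import math
import random


def student_t(df, rng):
    """Draw from Student's t with df degrees of freedom (normal / root chi-square)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)


def fiducial_difference_draws(xbar1, s1, n1, xbar2, s2, n2, draws=20000, seed=0):
    """Sorted Monte Carlo draws of mu_1 - mu_2 from the two fiducial distributions."""
    rng = random.Random(seed)
    out = []
    for _ in range(draws):
        mu1 = xbar1 + student_t(n1 - 1, rng) * s1 / math.sqrt(n1)
        mu2 = xbar2 + student_t(n2 - 1, rng) * s2 / math.sqrt(n2)
        out.append(mu1 - mu2)
    out.sort()
    return out


def central_interval(sorted_draws, level=0.95):
    """Equal-tailed interval read off the sorted draws."""
    m = len(sorted_draws)
    lo = sorted_draws[int(m * (1.0 - level) / 2.0)]
    hi = sorted_draws[int(m * (1.0 + level) / 2.0) - 1]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical summary statistics for two samples of size 10.
    d = fiducial_difference_draws(10.0, 1.0, 10, 8.0, 1.0, 10)
    print(central_interval(d))
```

The resulting interval is an approximation to the one obtained from the tabulated percentage points, with accuracy limited by the number of draws.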
In reply to criticisms concerning the error rate of a test based on this
distribution Fisher has stated “it is irrelevant to the purposes of the test for the
experimenter is not concerned with repeated sampling from the same population.”
In the transformation framework proposed here the fiducial distribution is
generated by repeated sampling from the error variables and has a frequency
interpretation given the $\bar{x}$’s and $s$’s.


11. Confidence intervals versus fiducial intervals; an example. In 1939 B. L.
Welch [18] considered a simple statistical problem and examined some properties
of a best confidence region and a fiducial interval obtained from the use of an
ancillary statistic. Let $(x_1, x_2)$ be a sample of two from the uniform distribution
on the interval $\theta \pm \frac{1}{2}$. The statistic $z_2 = (x_1 - x_2)/2$ is easily seen to have a
fixed distribution and hence is ancillary. The statistic $z_1 = (x_1 + x_2)/2$ is then
conditionally sufficient for $\theta$.

A fiducial interval with probability 95% is $z_1 \pm 95\%\,(0.5 - |z_2|)$; it embodies
95% of the permissible range for $\theta$. A best unbiased 95% confidence interval is

\[ z_1 \pm \min\{[0.5 - |z_2|],\ [0.5 + |z_2| - (0.5 - 0.95/2)^{1/2}]\}. \]

The confidence interval embodies the full permissible range for $\theta$ when that
range $(1 - 2|z_2|)$ is small.

The fiducial interval is also a confidence interval and consequently from the
confidence point of view falls short of being optimum. Welch plots for each
interval the probability that it covers a value $\Delta$ away from $\theta$. Over most of
the range of $\Delta$ the confidence interval has a smaller probability, and nowhere
larger.

There is however another side to the comparison. When the permissible range
for $\theta$ is small the confidence interval embraces not 95% but the full range of
possible values for $\theta$. It is not hard to see what is happening: when the range of
permissible values is short the confidence interval takes the full range on the
grounds that in probability there may be another occasion when the range of
permissible values is larger and less than 95% can be chosen and still maintain
the long run 95% average. Is the long run average more important than the
specialized knowledge of the particular situation?
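The behaviour of the two intervals in Welch's example can be explored by simulation. The sketch below assumes the half-widths $0.95\,(0.5 - |z_2|)$ for the fiducial interval and $\min\{0.5 - |z_2|,\ 0.5 + |z_2| - (0.5 - 0.95/2)^{1/2}\}$ for the confidence interval, both centred at $z_1$; the function names, seed, and number of trials are illustrative choices.

```python
import math
import random

# Constant appearing in the confidence interval: (0.5 - 0.95/2) ** (1/2).
C = math.sqrt(0.5 - 0.95 / 2.0)


def half_widths(z2):
    """Return (full permissible, fiducial, confidence) half-widths given z2."""
    full = 0.5 - abs(z2)
    fiducial = 0.95 * full
    confidence = min(full, 0.5 + abs(z2) - C)
    return full, fiducial, confidence


def coverage(theta=0.0, trials=100000, seed=1):
    """Long-run coverage of both intervals under repeated sampling of (x1, x2)."""
    rng = random.Random(seed)
    fid_hits = conf_hits = 0
    for _ in range(trials):
        x1 = theta - 0.5 + rng.random()
        x2 = theta - 0.5 + rng.random()
        z1 = (x1 + x2) / 2.0
        z2 = (x1 - x2) / 2.0
        _, fid, conf = half_widths(z2)
        err = abs(z1 - theta)
        fid_hits += err <= fid
        conf_hits += err <= conf
    return fid_hits / trials, conf_hits / trials


if __name__ == "__main__":
    print(coverage())         # both long-run coverages should be near 0.95
    print(half_widths(0.45))  # short range: confidence half-width equals the full one
    print(half_widths(0.0))   # long range: confidence half-width is below the full one
```

Both intervals average 95% in the long run, but only the fiducial interval keeps the 95% figure conditionally on each value of the ancillary $z_2$.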

The comparison between the intervals thus involves a conflict between a 
general principle and specific knowledge for data in hand. In his 1958 paper Cox 
emphasized the second of these. Recent papers by Buehler [4] and Wallace [17] 
are concerned with this sort of conflict and use notions of relevant subsets and 
strong exactness. There are certainly good grounds, I feel, for ignoring a general 
principle when it is in conflict with data-specific knowledge. My preference 
weighs heavily in favour of the fiducial interval for Welch’s example. 

D. R. Brillinger, in unpublished research at Princeton University, has proved 
that a fiducial region chosen to be invariant in the transformation framework is 
also a confidence region. An advantage of the frequency interpretation in terms 
of an error variable is that it is applicable when the fiducial region has its form 
based on the variable being observed, the case not covered by Brillinger’s result. 


12. The Creasy-Fieller problem. The problem of finding an interval estimate
for the ratio of the means of two normal distributions was examined by Miss M.
A. Creasy [6] and E. C. Fieller [7] in papers presented to the Royal Statistical
Society in 1954. The papers drew attention to differences between the fiducial
and the confidence methods and provoked interesting discussion [6] [7] bearing on
these differences.

The specification in its simplest form describes independent variables $x$ and $y$
that are normally distributed with means $\mu$ and $\nu$ and with known variance. The
problem is to focus on the parameter $\alpha = \mu/\nu$ and find some sort of interval
estimate.

Fieller used the pivotal quantity 


y — ax 

(1 + 2a + a’)! 
and from it obtained regions for the parameter $\alpha$ by manipulations standard to
the confidence method. The regions are certainly confidence regions; Fieller,
however, used the term fiducial for them, but there seems to be some doubt
whether the method of derivation fits the fiducial pattern implicit in Fisher’s
work.

Miss Creasy proceeded differently. She combined the separate fiducial
distributions for $\mu$ and $\nu$ to obtain a joint fiducial distribution and then
integrated to obtain a marginal fiducial distribution for $\alpha = \mu/\nu$. Fiducial
limits were then calculated from this distribution.

The pivotal quantity used by Fieller is not compatible with the location 
transformations natural to the variables x and y. His regions have a confidence 
frequency interpretation but they do not seem to me to have a frequency in- 
terpretation given x and y. 

From the transformation point of view Miss Creasy’s procedure extracts what
the data have to say in a frequency sense about the parameters $\mu$ and $\nu$. This
frequency information is then taken at its face value and used to obtain the
marginal distribution of the parameter of interest. The fiducial distribution has
a frequency interpretation in terms of error variables and its derivation seems to
fit the pattern implicit in Fisher’s work.

In this simplest form of the problem, an interval obtained by Miss Creasy’s 
method is always contained in the interval at the same frequency level obtained 
by Fieller’s method. As a result, if the intervals are evaluated from the point of 
view of repeated sampling from the same population, then the probability with 
which the Creasy interval covers the value of the parameter must be less than 
the corresponding probability for Fieller’s confidence interval and hence less than 
the Creasy interval’s fiducial probability. 

The general tenor of the discussion following the Creasy-Fieller papers 
favoured the approach used by Fieller—it had the backing of confidence theory. 


13. Regression analysis. Consider linear regression analysis with normally
distributed errors. Let $\alpha, \beta_1, \cdots, \beta_p$ designate the regression parameters
and $\sigma$ the scale parameter of the error distribution. Also let $a, b_1, \cdots, b_p$
designate the least-squares fitted regression coefficients and $s$ the standard
deviation about regression. The fitted coefficients $a$, $b = (b_1, \cdots, b_p)$ have a
multivariate normal distribution with means $\alpha$, $\beta = (\beta_1, \cdots, \beta_p)$ and
covariance matrix of specified form but scaled by $\sigma^2$; let
$n(a, b \mid \alpha, \beta, \sigma)$ designate its probability density function. The standard
deviation $s$ has a $\chi$ type of distribution scaled by $\sigma$; let $\chi(s \mid \sigma)$ designate
its density function. A natural group of transformations involves location changes
for each regression coefficient and a scale change for the regression coefficients
and error scale-parameter. The probability element for $a, b, s$ given
$\alpha, \beta, \sigma$ can be written

\[ n(a, b \mid \alpha, \beta, \sigma)\, \chi(s \mid \sigma) \cdot s^{p+2} \cdot \frac{da\, db\, ds}{s^{p+2}} \]

where the measure element is arranged so as to exhibit left Haar measure.
Formula 3 in Section 3 then gives the fiducial probability element for
$\alpha, \beta, \sigma$ given $a, b, s$:

\[ n(a, b \mid \alpha, \beta, \sigma)\, \chi(s \mid \sigma) \cdot s^{p+2} \cdot \frac{d\alpha\, d\beta\, d\sigma}{s^{p+1}\sigma}. \]

This distribution for $\alpha, \beta, \sigma$ can be described by the equations

\[ \alpha = a + \sigma w_0, \qquad \beta_i = b_i + \sigma w_i, \qquad \sigma = s\sqrt{f}/\chi \]

in terms of variables $w_0, \cdots, w_p, \chi$, the $w$’s having a multivariate normal
distribution with means 0 and with covariance matrix equal to the inverse of the
matrix of inner products of the structural vectors corresponding to the parameters
$\alpha, \beta_1, \cdots, \beta_p$, and $\chi$ being distributed as a $\chi$ variable with the usual
‘error’ degrees-of-freedom.
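The three defining equations lend themselves to direct simulation. The sketch below treats the case of a single regressor ($p = 1$), assuming the equations $\alpha = a + \sigma w_0$, $\beta_1 = b_1 + \sigma w_1$, $\sigma = s\sqrt{f}/\chi$ with $f$ taken to be the error degrees of freedom $n - 2$ and the structural vectors taken to be the constant vector and the regressor. The function names and the data are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit y = a + b x; returns a, b, s, f and the inverse of X'X."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    f = n - 2  # error degrees of freedom (assumed meaning of f)
    rss = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(rss / f)
    # Inverse of the inner-product matrix [[n, sx], [sx, sxx]].
    inv = ((sxx / det, -sx / det), (-sx / det, n / det))
    return a, b, s, f, inv


def draw_fiducial(a, b, s, f, inv, rng):
    """One draw of (alpha, beta, sigma) from the equations quoted in the lead-in."""
    # Cholesky factor of inv, used to generate (w0, w1) with covariance inv.
    l00 = math.sqrt(inv[0][0])
    l10 = inv[1][0] / l00
    l11 = math.sqrt(inv[1][1] - l10 * l10)
    e0, e1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    w0 = l00 * e0
    w1 = l10 * e0 + l11 * e1
    chi = math.sqrt(rng.gammavariate(f / 2.0, 2.0))  # chi variable with f d.f.
    sigma = s * math.sqrt(f) / chi
    return a + sigma * w0, b + sigma * w1, sigma


if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [1.0 + 2.0 * x + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]
    a, b, s, f, inv = fit_line(xs, ys)
    rng = random.Random(0)
    print(draw_fiducial(a, b, s, f, inv, rng))
```

Repeated draws give an empirical version of the joint fiducial distribution, from which marginal distributions for individual coefficients can be read off.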

In certain cases of nonlinear regression the space for the mean of the basic 
variables may be contained in a linear subspace of slightly higher dimension. 
It then seems reasonable to calculate the fiducial distribution for the regression 
coefficients in the linear subspace and then condition it according to permissible 
parameter values. If the sample size is large enough most such problems become 
approximately linear and the fiducial method can be applied directly. 

If the error distribution form is something other than the normal, the least 
squares estimates will in general not be sufficient. The transformation method 
can, however, still be used but in relation to an ancillary statistic—a value of the 
ancillary statistic corresponding to an orbit under the transformation group. 


14. The correlation coefficient. Consider estimation problems concerned with
sampling from a bivariate normal distribution. Let $\bar{x}, \bar{y}, s_{11}, s_{12}, s_{22}$ be
the sufficient statistic estimates for the parameters
$\mu, \nu, \sigma_{11}, \sigma_{12}, \sigma_{22}$.

In the original paper [8] on fiducial probability Fisher produced a fiducial
distribution for the correlation coefficient $\rho = \sigma_{12}/(\sigma_{11}\sigma_{22})^{1/2}$ by the
method outlined in Section 1 of this paper. In the discussion following the Creasy
and Fieller papers [6] [7] Fisher produced a fiducial distribution for $(\mu, \nu)$.
Mauldon [13] obtained several alternative pivotal quantities and hence several
alternative fiducial distributions for the parameters. Quenouille [3] in his recent
book, The Fundamentals of Statistical Reasoning, gives a set of rules for handling
fiducial probabilities and applies them to this bivariate problem. There seems to
be a general lack of consistency among these results. Various transformation groups
can be used on the basic sample space for sampling from the bivariate normal 
distribution. One group that might be natural in certain situations is formed from 
the following kinds of transformation: scale and location changes for x; changes 
in y that are of linear regression form on x; changes in scale of deviations from 
the linear regression. The resulting model is of the form discussed in Section 3 
and a fiducial distribution of the five parameters can be derived. From this 
joint distribution a marginal fiducial distribution for the correlation coefficient 
can be obtained; it will have a frequency interpretation in terms of repeated 
sampling from the error variable. 

The transformation group just given can be used with the roles of x and y 
interchanged—that is, with regression of x on y. From this, a marginal fiducial 
distribution for the correlation coefficient can be obtained and it will in general 
be different from the fiducial distribution mentioned in the preceding para- 
graph. This second fiducial distribution also has a frequency interpretation but in 
terms of a different kind of random error transformation. These two distributions 
appear symmetrically one with respect to the other and hence must be different 
from the fiducial distribution in Fisher’s original fiducial paper [9]. The only 
frequency interpretation for the original fiducial distribution that I can see is the 
frequency interpretation that customarily goes with confidence intervals. 

This multiplicity of fiducial distributions for the correlation coefficient perhaps 
reflects certain inadequacies of the correlation coefficient and hence of the 
bivariate normal for the strong statements of the fiducial method. Additional 
structure in the form of suitable or reasonable transformations on the sample 
space gives some unity in terms of relative positioning of points and can yield a 
fiducial distribution with a frequency interpretation that in effect states where 
the parameter might be relative to the observation. Perhaps this additional 
information from the physical situation is necessary before we can make the 
strong statements of fiducial probability. 


15. Addendum. Prof. Allan Birnbaum has given a paper at this meeting
describing some results of his on the topic “informative inference.” Consider the
case of two parameter values $\theta_1$ and $\theta_2$ and distributions that admit symmetry
of the likelihood ratio.


There is then a unique decomposition into simple experiments ordered by 
“more informative than.” These simple experiments are symmetrical. This model 
lends itself nicely to the transformation argument of fiducial probability. A value 
for the ancillary statistic corresponds to the particular simple experiment one 
finds oneself in. Within that simple experiment the permutation group on two 
elements can be used and a fiducial distribution on the two parameter values 
derived. The ratio of fiducial probabilities turns out to be the ratio of the
likelihoods. This fiducial distribution has a frequency interpretation given the
observation.

This is an example of the use of fiducial probability when the sample space is 
finite. The method of Sections 3 and 5 is of course valid for finite groups and thus 
extends, in this direction, the usual usage of fiducial probability. 


16. Bibliography. Tukey [16] gives a bibliography of fiducial probability; a 
more extensive bibliography appeared in the handout for the 1958 Wald lectures 
given by Tukey at the August 1958 meeting of the Institute of Mathematical 
Statistics. 


A CENTRAL LIMIT THEOREM FOR PARTLY DEPENDENT VARIABLES


By H. J. Godwin and S. K. Zaremba
University College of Swansea, Wales


1. Introduction and definitions. It is well-known that the Central Limit 
Theorem can be extended to cases in which the random variables under considera- 
tion are not entirely independent. In particular, various theorems have been pro- 
duced with the purpose of dealing with variables which are dependent only when 
they are in some sense near to each other. The case of m-dependent variables 
(see, for instance, [13] and [16]) belongs to this category. Another case of this 
kind arises when the variables have several indices and are regarded as near to 
each other when they have at least one index value in common (for a somewhat 
special instance of this case see [9]). The importance of the latter case is due to 
the fact that it covers a large class of statistics for which W. Hoeffding [5] sug- 
gested the name of U-statistics; however, these statistics are only a special in- 
stance of it, as can be seen from the reduced number of degrees of freedom. 

The purpose of the present paper is to prove a general form of the Central 
Limit Theorem for partly dependent variables. Its statement is believed to in- 
clude, as special cases, all the hitherto published propositions on these lines, to 
cover most, if not all, the situations which have been treated ad hoc, and to go, 
in some directions, beyond the previously obtained results. As remarked by 
Feller [2], limiting distributions of normalized sums of random variables should 
not depend on the existence of moments; accordingly, no moments are postu- 
lated, and indeed the most general form of the Central Limit Theorem for in- 
dependent random variables [2] is contained in the theorem which follows. The 
statement of the latter may appear slightly cumbersome but it implies, as corol- 
laries, a variety of simpler propositions which are given in Section 3; on the other 
hand, its proof, which is a generalization of the argument in [16], and does not 
reduce the general case to that of independent variables, remains conceptually 
as simple as it would be if the argument were confined to some of the special 
cases of partly dependent variables. In order to simplify the language, the whole 
argument is stated for one-dimensional variables, but there is no difficulty in 
extending it to multi-dimensional variables; a general expression for the mixed 
moments given, for instance, in [7] is useful in applying the multivariate form of 
the Second Limit Theorem (discussed, for instance, in [10], Section 7). 

In order to avoid misunderstandings, it should be remembered that pairwise 
disjoint sets of random variables are called (mutually) independent if the joint 
probability distribution function of their union is the product of the joint prob- 
ability distribution functions of the various sets. A set of random variables will 
be called irreducible if it cannot be decomposed into two (mutually) independent 
proper subsets. But the factorization of a joint probability distribution function 
applies also to all the corresponding marginal distributions. Hence, if the
pairwise disjoint sets $S_1, \cdots, S_n$ of random variables are independent and if
$S'_1, \cdots, S'_n$ are subsets of $S_1, \cdots, S_n$ respectively, these subsets are also
independent. It follows that the partition of any denumerable set of random
variables into irreducible sets is unique.

The analysis of relations of dependence between random variables is compli- 
cated by the well-known fact (see, for instance, [1], Section 14.4) that pairwise 
independence does not imply independence in general. 

In order to overcome this difficulty, it is proposed to describe as linkedness
any symmetrical and reflexive relation between random variables of a
given set which satisfies the condition that any two subsets are (mutually)
independent whenever no variable of one subset is linked with any variable
of the other.

If two variables are not independent then they must be regarded as linked 
and, at the other extreme, we can construct the relation trivially by making 
any two variables linked. It is usually most convenient to restrict the linkedness 
as far as possible; in some of the applications listed below (Section 3) it is, in 
fact, necessary to regard variables as linked only when they are correlated, but 
in the case of the method of paired comparisons a wider grouping has to be 
linked. In the case of m-dependent variables two variables can be regarded as 
linked when their indices differ by not more than m. One can also think of a 
family of random variables with several indices and with a relation of linkedness 
equivalent to the presence of a given number of common index values. 
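The two index-based relations just mentioned can be written down concretely. The sketch below is only an illustration with hypothetical function names; it checks the symmetry and reflexivity required of a linkedness relation, but the defining independence condition is a property of the joint distribution and cannot be verified from the indices alone.

```python
def linked_m_dependent(i, j, m):
    """m-dependent case: variables are linked when their indices differ by at most m."""
    return abs(i - j) <= m


def linked_by_common_indices(k, l, r):
    """Multi-index case: linked when at least r index values are common to k and l."""
    return len(set(k) & set(l)) >= r


def is_symmetric_and_reflexive(items, linked):
    """Check the two formal properties of a linkedness relation on a finite set."""
    reflexive = all(linked(x, x) for x in items)
    symmetric = all(linked(x, y) == linked(y, x) for x in items for y in items)
    return reflexive and symmetric


if __name__ == "__main__":
    indices = list(range(6))
    print(is_symmetric_and_reflexive(indices, lambda i, j: linked_m_dependent(i, j, 2)))
    pairs = [(i, j) for i in range(4) for j in range(i + 1, 4)]
    print(is_symmetric_and_reflexive(pairs, lambda k, l: linked_by_common_indices(k, l, 1)))
```

In the second relation the threshold must not exceed the number of distinct index values carried by each variable, since otherwise reflexivity fails.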


2. The Main Theorem.

THEOREM. Let $\{x_k\}$, with $k$ belonging to $K$, be a denumerable family of random
variables with a well-defined relation of linkedness, and $K_t$, with $t$ belonging to
$T$, a family of finite subsets of $K$, $S_t$ being the set of all the variables $x_k$ for
which $k$ belongs to $K_t$. The precise nature of $K$ and $T$ does not need to be
prescribed; we can take $T$ to be a topological space with the point $\infty$ adjoined so
that $t \to \infty$ is meaningful.¹ Assume the existence of a number $d$, a function
$\gamma(m)$ defined for all integral values of $m$ greater than 2, and a function
$\theta(t)$ defined for all $t$ in $T$, with the property that, for all $m$ and $t$ in $T$,
$\gamma(m)\theta(t)^{m-d}$ is an upper bound for the number of sequences of elements of
$S_t$ having $m$ terms, beginning with any two arbitrarily given linked terms and
forming irreducible sets. Moreover, assume that, for a suitable family of positive
numbers $a_t$ corresponding to any $t$ in $T$ and for any positive $\eta$, the following
four conditions are satisfied:


(i) \[ \sum_{k \in K_t} P[\,|x_k| > a_t\,] \to 0 \quad \text{as } t \to \infty; \]


1 In each of the examples given in Section 3, K can be regarded as a vector space, but 
imposing on K and T any restrictions beyond the conditions of the theorem would have 
no bearing whatsoever on its proof, and indeed could only help to obscure the gist of the 
argument. 
(ii) \[ a_t^{-1} \sum_{k \in K_t} \int_{\eta\theta(t)^{-1}a_t < |x| \le a_t} |x|\, dF_k(x) \to 0 \quad \text{as } t \to \infty, \]

where $F_k(x)$ is the probability distribution function of $x_k$;

(iii) \[ \lim_{t \to \infty} a_t^{-2} \sum_{(k,l)}^{(t)} \iint_{|x|,\,|y| \le \eta\theta(t)^{-1}a_t} (x - b_k^{(t)})(y - b_l^{(t)})\, dF_{k,l}(x, y) = 1, \]

where $F_{k,l}(x, y)$ is the joint probability distribution function of $x_k$ and $x_l$,
$\sum_{(k,l)}^{(t)}$ denotes a sum extended to all pairs of values of $k$ and $l$ belonging to
$K_t$ and corresponding to linked variables $x_k$ and $x_l$, and

\[ b_k^{(t)} = \int_{|x| \le \eta\theta(t)^{-1}a_t} x\, dF_k(x); \]


(iv) 0(t)? “az” 2» | s= be |ly — bi | dF, (2, y) 
\z|. |u| sv0() ~1a, 
is bounded. Then, as t — «, the distribution of the random variable 
X,; = a; 2 stan (xy ~~ of) 
tends to be normal with zero mean and unit variance. 

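PROOF. It is easy to see that, if the conditions of the theorem are satisfied,
there exists a family of positive numbers $\epsilon_t$ corresponding to every $t$ in $T$ in such
a way that $\epsilon_t \to 0$ as $t \to \infty$, and that the conditions (ii'), (iii') and (iv'),
obtained from (ii), (iii) and (iv) by substituting $\epsilon_t$ for $\eta$, are satisfied. Put

$$u_{tk} = x_k, \quad z_{tk} = y_{tk} = 0 \quad \text{if } |x_k| > a_t;$$
$$z_{tk} = x_k, \quad y_{tk} = u_{tk} = 0 \quad \text{if } \epsilon_t\theta(t)^{-1}a_t < |x_k| \le a_t;$$
$$y_{tk} = x_k, \quad u_{tk} = z_{tk} = 0 \quad \text{if } |x_k| \le \epsilon_t\theta(t)^{-1}a_t;$$

$$Y_t = a_t^{-1} \sum_{k \in K_t} (y_{tk} - b_{tk}^{(\theta(t))}).$$

Then

(1) $$X_t = Y_t + a_t^{-1} \sum_{k \in K_t} z_{tk} + a_t^{-1} \sum_{k \in K_t} u_{tk}.$$

But (i) entails

$$P\Bigl[\sum_{k \in K_t} u_{tk} \ne 0\Bigr] \to 0 \quad \text{as } t \to \infty,$$

and a fortiori

(2) $$\operatorname*{p\,lim}_{t \to \infty} a_t^{-1} \sum_{k \in K_t} u_{tk} = 0.$$

On the other hand, in the new notation, (ii') becomes

$$a_t^{-1} \sum_{k \in K_t} E(|z_{tk}|) \to 0 \quad \text{as } t \to \infty.$$

A fortiori,

(3) $$a_t^{-1} E\Bigl(\Bigl|\sum_{k \in K_t} z_{tk}\Bigr|\Bigr) \to 0 \quad \text{as } t \to \infty.$$

Hence, by an application of the Bienaymé-Chebyshev inequality,

(4) $$\operatorname*{p\,lim}_{t \to \infty} a_t^{-1} \sum_{k \in K_t} z_{tk} = 0.$$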


The boundedness of each of the variables $y_{tk}$ ensures the existence of all its
moments. In particular, according to the definitions of $b_{tk}^{(\eta)}$, $y_{tk}$, $z_{tk}$ and $Y_t$,

$$E(Y_t) = -a_t^{-1} E\Bigl(\sum_{k \in K_t} z_{tk}\Bigr),$$

so that (3) implies

(5) $$E(Y_t) \to 0 \quad \text{as } t \to \infty.$$

Remembering that, according to the definition of $y_{tk}$, $E(y_{tk}) = b_{tk}^{(\epsilon_t)}$, put

(6) $$\xi_{tk} = a_t^{-1}[y_{tk} - E(y_{tk})] = a_t^{-1}[y_{tk} - b_{tk}^{(\epsilon_t)}].$$

There is no difficulty in verifying that the mutual independence of subsets of
$S_t$ entails that of the corresponding subsets of $\{\xi_{tk}\}$; hence, in particular, it is
admissible to regard, for any $k$ and $l$ in $K_t$, $\xi_{tk}$ and $\xi_{tl}$ as linked if, and only if,
$x_k$ and $x_l$ are linked. On the other hand, now

$$Y_t - E(Y_t) = \sum_{k \in K_t} \xi_{tk};$$

and, consequently, for any positive integer $m$,

$$E(Y_t - E(Y_t))^m = \sum_{k(1) \in K_t} \cdots \sum_{k(m) \in K_t} E(\xi_{tk(1)} \cdots \xi_{tk(m)}).$$

However, the last summand can be decomposed into a product of expectations
according to the unique decomposition of $\xi_{tk(1)} \cdots \xi_{tk(m)}$ into irreducible
sets, and terms corresponding to the same factorization can be grouped together.
If, in the corresponding partial sum, we neglect to omit the products of
moments in which the sets of variables under the various expectation signs are
not mutually independent, the summation variables under each expectation
sign run independently of those under the other expectations and, in consequence,
the partial sum can be factorized into a product of sums of moments.

In any of these products, sums of first moments are equal to 0, while owing to
(iii') sums of second moments tend to 1. A sum of moments of any order $q$
exceeding 2 will be shown to tend to 0 as $t \to \infty$. Indeed, this will be seen to
remain true even if all the moments are replaced by the corresponding absolute
moments. In the first place it should be noted that under each expectation the
first variable is necessarily linked with at least one of the others. Accordingly
the terms of the sum can be grouped into $q - 1$ overlapping classes, and since
the moments are absolute, it suffices to show that the sum of all the terms in
any of these classes tends to 0. Without loss of generality, it can be assumed that







the first two variables are linked. The absolute values of the moments can only
increase if the last $q - 2$ variables are each replaced by $2\epsilon_t\theta(t)^{-1}$, which is an
obvious upper bound for their values. According to the assumptions of the
theorem, the number of terms corresponding to any combination of values of
the first two summation indices does not exceed $\gamma(q)\theta(t)^{q-d}$ and, therefore, the
sum is bounded by

(7) $$(2\epsilon_t)^{q-2} \gamma(q) \theta(t)^{2-d} \sum_{(k,l)}^{(t)} E(|\xi_{tk}\xi_{tl}|),$$

which tends to 0 according to (iv') and to the definition of $\xi_{tk}$.

There remains to show that the sum of all the terms we failed to omit tends to
0 as $t \to \infty$. It is easy to see that this sum is a linear combination, with
coefficients independent of $t$, of expressions obtained from products of sums of the
type considered above by a process which could be described as one of
amalgamating factors. Two, or more, factors are thus amalgamated if they are
replaced by one sum of products of former summands, the new sum being
restricted to sets of index values giving rise to irreducible sets of random variables
(in addition to the original restriction requiring the variables under each
expectation to form an irreducible set). But such amalgamated factors involve at
least four summation indices, and, clearly, their absolute values are bounded in
the same way as the previously considered sums of absolute moments.
Consequently, the amalgamated factors, and, therefore, also the corresponding
products, as well as their linear combinations mentioned above, tend to 0 as $t \to \infty$.

Hence, apart from terms the sum of which tends to 0, $E(Y_t - E(Y_t))^m$
is a sum of products of sums of second moments. Since no such products can
arise when $m$ is odd,

(8) $$\lim_{t \to \infty} E(Y_t - E(Y_t))^m = 0 \quad \text{when } m \text{ is odd.}$$

If $m$ is even, we are left with $m!/[(\tfrac{1}{2}m)!\, 2^{m/2}]$ products of sums of second moments,
arising out of the same number of possible partitions of $m$ variables into $\tfrac{1}{2}m$
irreducible pairs. Since each of these products tends to 1,

(9) $$\lim_{t \to \infty} E(Y_t - E(Y_t))^m = m!/[(\tfrac{1}{2}m)!\, 2^{m/2}] \quad \text{when } m \text{ is even.}$$

The limits, given by (8) and (9), of the central moments of $Y_t$ are those of a
normal distribution with zero mean and unit variance. Consequently, according
to the Second Limit Theorem (see, for instance, [3]), this is the limiting
distribution of $Y_t - E(Y_t)$. Finally, owing to a proposition given by Cramér
([1], Section 20.6) and as a consequence of (1), (5), (4) and (2), this is also
the limiting distribution of $X_t$. Hence the proof is complete.
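
The counting step behind (8) and (9) can be checked numerically. The sketch below (function names are illustrative, not from the paper) counts the partitions of m labelled items into unordered pairs and compares the count with the closed form m!/[(m/2)! 2^(m/2)], which is also the m-th moment of the standard normal distribution.

```python
from math import factorial


def count_pairings(m: int) -> int:
    """Number of ways to split m labelled items into unordered pairs."""
    if m == 0:
        return 1
    if m % 2 == 1:
        return 0
    # Pair the first item with any of the other m - 1 items, then recurse.
    return (m - 1) * count_pairings(m - 2)


def normal_moment(m: int) -> int:
    """m-th moment of N(0, 1): 0 for odd m, m!/((m/2)! 2^(m/2)) for even m."""
    if m % 2 == 1:
        return 0
    return factorial(m) // (factorial(m // 2) * 2 ** (m // 2))


for m in range(0, 11):
    assert count_pairings(m) == normal_moment(m)

print([normal_moment(m) for m in range(1, 9)])  # [0, 1, 0, 3, 0, 15, 0, 105]
```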

Remark I. If the moments of $a_t^{-1} \sum_{k \in K_t} (z_{tk} + u_{tk})$ of some even order
$m$ tend to 0 as $t \to \infty$ (which can easily be expressed in terms of the joint
probability distribution functions of $m$ variables $\{x_k\}$), then, owing to the Hölder
inequality and to the results obtained while proving the main theorem, the
moments of $X_t$ up to the order $m$ tend to the corresponding moments of the
limiting distribution.

Remark II. The argument above also yields a form of the weak law of large
numbers for partly dependent variables. More precisely, if, in the statement of
the theorem, the second condition is relaxed by requiring only that (ii) should
hold for some positive $\eta$ and if the last two conditions are replaced by the
condition that, for the same $\eta$,

(v) $$\lim_{t \to \infty} a_t^{-2} \sum_{(k,l)}^{(t)} \iint_{|x|,\,|y| \le \eta\theta(t)^{-1}a_t} [x - b_{tk}^{(\eta)}][y - b_{tl}^{(\eta)}] \, dF_{kl}(x, y) = 0,$$

then the conclusion is that

$$\operatorname*{p\,lim}_{t \to \infty} X_t = 0.$$

Indeed, if in the definition of $y_{tk}$ and $z_{tk}$, $\epsilon_t$ is replaced by $\eta$, the arguments
proving (2), (4) and (5) still apply, while (v) is equivalent to

$$\operatorname*{l.i.m.}_{t \to \infty} \, [Y_t - E(Y_t)] = 0;$$

the proposition follows immediately from the last relation in conjunction with
(1), (2), (4) and (5).


3. Special cases and applications. An important particular case of the main
theorem arises when $d = 2$ and $\theta(t)$ is an upper bound for the number of elements
of $S_t$ which are linked with any random variable belonging to this set; then we
can take $\gamma(m) = (m - 1)!$ and the last condition of the theorem is simplified
by the omission of the factor $\theta(t)^{2-d}$.

In one of the main applications, $K$ is the set of all the sets of, say, $r$ positive
integers, and $K_t$ is the subset of $K$ determined by the requirement that all these
integers should be less than or equal to $t$, two variables being linked when the
two index sets have at least one element in common. More particularly, if the
joint probability distribution of any (finite) number of variables depends only
on which indices have the same value and not on that particular value or on the
values of the other indices, if the variables have finite second moments, and if
linked variables are correlated, then the conditions of the theorem are satisfied,
provided that we put

$$a_t^2 = \operatorname{var} \sum_{k \in K_t} x_k.$$

Indeed, then, $a_t = O(t^{r-1/2})$ and

$$P[|x_k| > a_t] \le a_t^{-2} \int_{|x| > a_t} x^2 \, dF_k(x) = o(t^{1-2r}).$$

Hence

$$\sum_{k \in K_t} P[|x_k| > a_t] = o(t^{1-r}),$$

which proves (i). Furthermore, $\theta(t) = t^r - (t - 1)^r$ so that $\theta(t)^{-1}a_t = O(t^{1/2})$. Thus

(10) $$\int_{\eta\theta(t)^{-1}a_t < |x| \le a_t} |x| \, dF_k(x) \le \int_{|x| > \eta\theta(t)^{-1}a_t} |x| \, dF_k(x) \le \eta^{-1}\theta(t)a_t^{-1} \int_{|x| > \eta\theta(t)^{-1}a_t} x^2 \, dF_k(x) = o(t^{-1/2}),$$

and consequently

(11) $$a_t^{-1} \sum_{k \in K_t} \int_{|x| > \eta\theta(t)^{-1}a_t} |x| \, dF_k(x) \to 0 \quad \text{as } t \to \infty,$$

which shows that (ii) is also satisfied. As a consequence of (ii),

$$E(x_k) - b_{tk}^{(\eta)} \to 0 \quad \text{as } t \to \infty$$

and this, combined with (10), implies (iii). Finally, it is easy to see that the
sum in (iv) is $O(t^{2r-1})$, ensuring that this condition is also satisfied.

Consequently, given a family of independent and identically distributed random
variables $\{X_i\}$ $(i = 1, 2, \cdots)$ as well as a symmetrical function $f$ of $r$
arguments, such that $f(X_1, \cdots, X_r)$ should have a second moment and
$f(X_1, X_2, \cdots, X_r)$ and $f(X_1, X_{r+1}, \cdots, X_{2r-1})$ should be correlated, the conditions
of the theorem will be satisfied by making $f(X_{i(1)}, \cdots, X_{i(r)}) = x_k$, with $k$
denoting the set $\{i(1), \cdots, i(r)\}$. Apart from the special form given to the
covariances in [5] and from the fact that the theorem is stated, there, in its
multidimensional version, this is Hoeffding's limit theorem for independently
distributed variables. It should be noted that the case when $f(X_1, X_2, \cdots, X_r)$
and $f(X_1, X_{r+1}, \cdots, X_{2r-1})$ are uncorrelated is a trivial case of Hoeffding's
theorem since, owing to his choice of the normalizing factor, the variances tend
to 0. On the other hand, in the statement above, the random variables $\{X_i\}$ can
have any number of dimensions, and if several functions are simultaneously
considered the statement applies to any linear combination of these functions,
implying an asymptotically normal joint distribution of the functions themselves
(see, for instance, Section 7 in [10]).

As pointed out by Hoeffding, the statements above can be applied to a whole 
range of statistics, such as Gini’s mean difference, Gini’s coefficient of concen- 
tration, functions of rank and of the signs of differences of random variables, 
difference-sign and rank correlations in samples, tests of independence, etc. 
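
As a concrete instance of the construction above, Gini's mean difference is the average of the symmetric kernel f(x, y) = |x - y| over all index sets of size r = 2. The sketch below is only an illustration: the helper names are hypothetical, and it computes the averaged statistic, whereas the theorem is stated for the sum normalized by a_t.

```python
from itertools import combinations


def u_statistic(sample, kernel, r):
    """Average of a symmetric kernel over all r-element subsets of the sample.
    Each subset corresponds to one variable x_k, with k the set of indices."""
    subsets = list(combinations(sample, r))
    return sum(kernel(*s) for s in subsets) / len(subsets)


def gini_mean_difference(sample):
    """Gini's mean difference: the kernel is |x - y| and r = 2."""
    return u_statistic(sample, lambda x, y: abs(x - y), 2)


# Pairs (1,2), (1,4), (2,4) give differences 1, 3, 2, whose mean is 2.
print(gini_mean_difference([1, 2, 4]))  # 2.0
```

Two of the summands are linked, in the sense of the text, exactly when their index pairs share an element.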

In general, when the random variables in question have a common upper
bound, i.e., when $P[|x_k| > A] = 0$ for some $A$ and all $k$ in $K$, the first two
conditions of the main theorem are automatically satisfied, provided that
$\theta(t)^{-1}a_t \to \infty$ as $t \to \infty$. This applies in particular to the test function in
Wilcoxon's test (see, for instance, [15]). Given two samples $x_1, \cdots, x_m$ and
$y_1, \cdots, y_n$ of two random variables, the test function is

(12) $$U = \sum_{i=1}^{m} \sum_{k=1}^{n} z_{ik},$$

where

$$z_{ik} = \begin{cases} 1 & \text{if } x_i > y_k, \\ 0 & \text{if } x_i \le y_k. \end{cases}$$


The scope of the test will not be discussed here (see, however, [4], [11], [14],
[18]). The theorem proved by Hoeffding in [5] cannot be applied here because
the parts played by the two different sets of variables are not symmetrical.
However, under the assumption that the two samples arise from two identically
distributed populations, the asymptotic normality of $U$ was proved in [11]
(where the distribution for small samples was obtained as well), and could be
deduced from a theorem in [6]; without this assumption, but with the restriction
that $m/n$ be constant, it was proved in [8]. On the other hand, apart from the
trivial case when the distributions do not overlap and so $U$ is constant, the
asymptotic normality of $U$ follows from the main theorem of the present paper under
any hypothesis about the distribution of the two random variables in question
and without any requirement on $m/n$; $t$ becomes a two-dimensional vector $(t_1, t_2)$
tending to infinity when both $t_1 \to \infty$ and $t_2 \to \infty$, and $K_t$ becomes the set of
all the pairs $(i, k)$ of positive integers such that $i \le t_1$, $k \le t_2$.
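
A minimal sketch of the test function (12) follows. It assumes the indicator z_ik equals 1 when x_i > y_k. The standardization uses the classical null moments mn/2 and mn(m + n + 1)/12, which hold for identically and continuously distributed samples without ties; those moments are supplied here as an assumption and are not derived in this paper.

```python
def wilcoxon_u(xs, ys):
    """U = sum over all pairs (i, k) of the indicator that x_i > y_k."""
    return sum(1 for x in xs for y in ys if x > y)


def standardized_u(xs, ys):
    """(U - mn/2) / sqrt(mn(m + n + 1)/12); assumes the null hypothesis
    of identical continuous distributions (no ties)."""
    m, n = len(xs), len(ys)
    mean = m * n / 2
    var = m * n * (m + n + 1) / 12
    return (wilcoxon_u(xs, ys) - mean) / var ** 0.5


# Pairs with x > y: (3,1), (5,1), (5,4), so U = 3.
print(wilcoxon_u([3, 5], [1, 4]))
```

Here z_ik and z_jl are linked when i = j or k = l, so each summand is linked with at most m + n - 1 of the mn summands.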

In some applications, the patterns of dependence of the random variables are
fairly complicated, and it was in order to cover such cases that, in the statement
of the main theorem, the constant $d$ was allowed to take values other than 2.
One such case arises in the treatment of the test function in the method of
"paired comparisons" [12] for the investigation of the transitivity of preferences.
The subject of the experiment is asked to choose between each pair formed with
the entities $v_1, \cdots, v_t$. Write $P_{(i,j,k)} = 1$ if the preferences confined to $v_i$,
$v_j$ and $v_k$ are not transitive, and $P_{(i,j,k)} = 0$ if they are. Under the null
hypothesis that all the choices between pairs are independent and equally probable,
Moran [12] proved the asymptotic normality of $\sum P_{(i,j,k)}$ by an argument
ad hoc.

The same result can be obtained by a direct application of the main theorem
of the present paper. It is easily seen that $P_{(i,j,k)}$ and $P_{(i',j',k')}$ have to be
regarded as linked whenever the two sets of index values have at least two
elements in common, so that $\theta(t) = 3t - 8$. Nevertheless, two sets of variables
are independent if there is only one link between them. Hence we can take
$d = 3$. Then $\gamma(m)$ is the number of possible patterns of links ensuring the
irreducibility of a set of $m$ of the variables $P_{(i,j,k)}$ (allowing for repetitions).
Furthermore, putting

$$a_t^2 = \operatorname{var} \sum_{i<j<k \le t} P_{(i,j,k)} = \sum_{i<j<k \le t} \operatorname{var} P_{(i,j,k)},$$

we obtain $a_t = O(t^{3/2})$, and consequently, $a_t\theta(t)^{-1} \to \infty$ as $t \to \infty$, which, in
view of the boundedness of the distribution of the variables, causes the
conditions (i), (ii) and (iii) to be satisfied automatically. The last condition of the
theorem is easily seen to be also satisfied.
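
The count θ(t) = 3t - 8 and the statistic itself are easy to check by enumeration. The sketch below is illustrative: the function names and the dictionary encoding of preferences are assumptions, not part of the paper.

```python
from itertools import combinations


def theta(t):
    """Number of triples from {1..t} sharing at least two elements with
    the triple {1, 2, 3} (the triple itself included)."""
    base = {1, 2, 3}
    return sum(1 for c in combinations(range(1, t + 1), 3)
               if len(base & set(c)) >= 2)


def circular_triads(t, beats):
    """Sum of P_(i,j,k): the number of intransitive triples.
    beats[(i, j)] with i < j is True when item i is preferred to item j."""
    def pref(i, j):
        return beats[(i, j)] if i < j else not beats[(j, i)]

    count = 0
    for i, j, k in combinations(range(1, t + 1), 3):
        forward = pref(i, j) and pref(j, k) and pref(k, i)
        backward = pref(j, i) and pref(k, j) and pref(i, k)
        if forward or backward:
            count += 1
    return count


assert all(theta(t) == 3 * t - 8 for t in range(3, 10))
# 1 beats 2, 2 beats 3, 3 beats 1: one circular triad.
print(circular_triads(3, {(1, 2): True, (2, 3): True, (1, 3): False}))
```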

In the same way, the theorem could be applied to the distribution of the test
function under alternative hypotheses respecting the pattern of dependence of
the variables set by the null hypothesis.

If $\theta(t)$ can be taken independent of $t$, as, for instance, in the case of
$m$-dependent variables, this simplifies the whole situation in a way which is quite
different from the simplification due to the boundedness of the distributions in
question. Conditions (i) and (ii) can, then, be combined into one condition:
$\lim_{t \to \infty} \sum_{k \in K_t} P[|x_k| > \eta a_t] = 0$ for every positive $\eta$, $\theta(t)$ can be omitted
in the statement of condition (iii), while, owing to the Schwarz inequality,
condition (iv) can be replaced by the simpler condition that

$$a_t^{-2} \sum_{k \in K_t} \int_{|x| \le \eta a_t} [x - b_{tk}^{(\eta)}]^2 \, dF_k(x)$$


should be bounded. Thus, as a special case of the theorem proved in the
preceding section, we find the most general statement of the Central Limit Theorem
for m-dependent variables published so far (see [13] and [16]), and, therefore,
as a still more special case, also the Central Limit Theorem for independent
random variables under conditions which are not only sufficient, but necessary
as well, at least when the normalized random variables are infinitesimal (see
[2] and [17]).

SOME MULTIVARIATE CHEBYSHEV INEQUALITIES WITH
EXTENSIONS TO CONTINUOUS PARAMETER PROCESSES¹


By Z. W. BIRNBAUM AND ALBERT W. MARSHALL
University of Washington; University of Washington and Stanford University 


0. Summary. In this paper we obtain some multivariate generalizations of 
Chebyshev’s inequality, two of which are extended to continuous parameter 
stochastic processes. The extensions are obtained in a natural way by taking 
into account separability and letting the number of variables approach infinity. 

Particular attention is paid to the question of sharpness. To show that the 
bound of the inequality cannot be improved, examples are given in a number 
of cases that attain equality. 


1. Introduction. We begin by discussing a model for the various generalizations 
of Chebyshev’s inequality, and for a standard proof that we shall use. Examina- 
tion of this proof will enable us to make some general comments concerning the 
problems of deriving inequalities and of proving sharpness. 

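Let $(\Omega, \mathcal{B}, P)$ be a probability space, and let $(\mathcal{X}, \mathcal{A})$ be a measurable space.
For each $i \in I$, an arbitrary index set, let $\mathcal{C}_i \subset \mathcal{A}$ and let $\mathcal{F}_i$ be a class of random
variables on $(\Omega, \mathcal{B})$ taking values in $(\mathcal{X}, \mathcal{A})$ such that $X \in \mathcal{F}_i$ whenever $Y \in \mathcal{F}_i$
has the same distribution as $X$. Chebyshev's inequality and its generalizations
are of the following form:

(1.1) $$X \in \mathcal{F}_i \text{ implies } P\{X \in A\} \le \Phi_i(A) \text{ for all } A \in \mathcal{C}_i \text{ and all } i \in I,$$

where for each $i \in I$, $\Phi_i$ is a non-negative function on $\mathcal{C}_i$.

For the usual Chebyshev inequality, $\mathcal{X}$ is the real line, $\mathcal{A}$ is the Borel sets,
$I = (-\infty, \infty) \times [0, \infty)$, $\mathcal{F}_{(\mu,\sigma^2)}$ is the set of all real-valued random variables
$X$ with expectation $\mu$ and variance $\sigma^2$, $\mathcal{C}_{(\mu,\sigma^2)}$ consists of all sets of the form
$A_\epsilon = (\mu - \epsilon, \mu + \epsilon)^c$ ($E^c$ denotes the complement of the set $E$), and

$$\Phi_{(\mu,\sigma^2)}(A_\epsilon) = \sigma^2/\epsilon^2.$$

In Sections 2 and 3, $\mathcal{X}$ will be Euclidean $n$-space $R^n$ for some $n$, and $\mathcal{A}$ will again
be the Borel sets.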
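
To make the usual Chebyshev case concrete, the sketch below builds the standard three-point distribution that attains the bound σ²/ε² whenever σ² ≤ ε². The construction is the classical extremal example, brought in here for illustration; the function names are hypothetical. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def extremal_distribution(mu, sigma2, eps):
    """Three-point law on {mu - eps, mu, mu + eps} with mean mu and variance
    sigma2, placing mass sigma2 / (2 eps^2) on each outer point."""
    mu, sigma2, eps = Fraction(mu), Fraction(sigma2), Fraction(eps)
    p = sigma2 / (2 * eps ** 2)
    if 2 * p > 1:
        raise ValueError("sigma^2 / eps^2 exceeds 1; the bound is trivial")
    return {mu - eps: p, mu: 1 - 2 * p, mu + eps: p}


def mean(dist):
    return sum(x * p for x, p in dist.items())


def variance(dist):
    m = mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())


def tail_probability(dist, mu, eps):
    """P{X in A_eps}, where A_eps is the complement of (mu - eps, mu + eps)."""
    return sum(p for x, p in dist.items() if abs(x - mu) >= eps)


dist = extremal_distribution(0, 1, 2)
print(mean(dist), variance(dist), tail_probability(dist, 0, 2))  # 0 1 1/4
```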

Inequalities of the type (1.1) can very often be proved as follows: for each
$i \in I$, one defines a function $f_i$ on $\mathcal{C}_i \times \mathcal{X}$ to $R$ such that, for each $A \in \mathcal{C}_i$,

1. $f_i(A, \cdot)$ is measurable,

2. $\int_\Omega f_i(A, X) \, dP$ is independent of $X \in \mathcal{F}_i$,

3. $\int_{\{X \in A^c\}} f_i(A, X) \, dP \ge 0$ for all $X \in \mathcal{F}_i$,

4. $\int_{\{X \in A\}} f_i(A, X) \, dP \ge P\{X \in A\}$ for all $X \in \mathcal{F}_i$.

Then

$$\Phi_i(A) = \int_\Omega f_i(A, X) \, dP \ge \int_{\{X \in A\}} f_i(A, X) \, dP \ge P\{X \in A\} \quad \text{for all } X \in \mathcal{F}_i.$$

Often conditions 3. and 4. are replaced by the stronger conditions

3'. $f_i(A, x) \ge 0$ for all $x \in \mathcal{X}$,
4'. $f_i(A, x) \ge 1$ for all $x \in A$.

This replacement will be made in Section 3, but not in Section 2.

The above model has been presented with various degrees of clarity and
generality by a number of authors. It is used quite extensively by Fréchet [4],
who credits Cantelli with its first presentation.

An inequality of the form (1.1) is said to be sharp if for all $i$ in $I$ and $A$ in
$\mathcal{C}_i$ there is a sequence $\{X_k\}_{k=1}^{\infty}$ of elements of $\mathcal{F}_i$ such that

$$\lim_{k \to \infty} P\{X_k \in A\} = \begin{cases} \Phi_i(A) & \text{if } \Phi_i(A) \le 1, \\ 1 & \text{if } \Phi_i(A) > 1. \end{cases}$$

By examining a proof of the kind described above where conditions 3'. and
4'. are satisfied, it is often possible to find an example for which equality holds
in (1.1) and thereby demonstrate sharpness. If $X$ is a random variable in $\mathcal{F}_i$ for
which equality holds for arbitrary but fixed $i$ in $I$ and $A$ in $\mathcal{C}_i$, then equality
must hold in 3'. and 4'. Hence (neglecting sets of zero probability) it must be
that $X$ assumes in $A$ ($A^c$) only values $x$ for which $f_i(A, x) = 1$ ($f_i(A, x) = 0$).
Using this determination of the values that $X$ may assume with positive
probability together with the requirement $X \in \mathcal{F}_i$, one can often find a distribution
for $X$ if one exists.

When deriving an inequality of the form (1.1), there are certain procedures
one may use to find the functions $f_i$. For example, if the bound is to involve
only second moments of the random variables, then $f_i$ must be a quadratic
form. This together with conditions 3'. and 4'. may so severely limit the possible
candidates for $f_i$ that one can write $f_i$ as a function involving only a few unknown
parameters. Their values can sometimes be found by the requirement that they
minimize the bound $\Phi_i$. Alternatively, one can begin by using the method of
the preceding paragraph to find (in terms of the unknown parameters) the
values that a random variable $X$ may assume with positive probability to
achieve equality in (1.1). The requirement that $X \in \mathcal{F}_i$ may then determine
the unknown parameters.


2. A generalization of Kolmogorov’s inequality. 
THEOREM 2.1. Let $X_1, X_2, \cdots, X_n$ be random variables such that

$$E(|X_k| \mid X_1, \cdots, X_{k-1}) \ge \psi_k |X_{k-1}| \quad \text{a.e.}^2,$$

where $\psi_k \ge 0$, $k = 2, 3, \cdots, n$. Let $a_k > 0$,

$$b_k = \max\Bigl(a_k,\ a_{k+1}\psi_{k+1},\ a_{k+2}\psi_{k+1}\psi_{k+2},\ \cdots,\ a_n \prod_{i=k+1}^{n} \psi_i\Bigr),$$

$k = 1, 2, \cdots, n$, $b_{n+1} = 0$, and let $X_0 = 0$. If $r \ge 1$ is such that $E|X_k|^r < \infty$,
$k = 1, 2, \cdots, n$, then

(2.1) $$P\Bigl\{\max_{1 \le k \le n} a_k|X_k| \ge 1\Bigr\} \le \sum_{k=1}^{n} (b_k^r - \psi_{k+1}^r b_{k+1}^r) E|X_k|^r = \sum_{k=1}^{n} b_k^r (E|X_k|^r - \psi_k^r E|X_{k-1}|^r).$$


² Even though we usually neglect to mention the underlying probability space $(\Omega, \mathcal{B}, P)$,
we use the abbreviation a.e. to mean "almost everywhere with respect to $P$."

REMARKS. Several known generalizations of Kolmogorov's inequality follow
from this theorem by setting $\psi_k = 1$, $X_k = Y_1 + Y_2 + \cdots + Y_k$, $k = 1, 2, \cdots, n$,
and further specializing the assumptions. In particular, assuming
$E(Y_1) = 0$, $E(Y_k \mid Y_1, \cdots, Y_{k-1}) = 0$ a.e., $k = 2, \cdots, n$, $r = 2$,
$a_1 = \cdots = a_n = 1/\epsilon$, one obtains an inequality given by Loève [7, p. 386] and by Doob
[3, p. 315]; assuming $Y_1, \cdots, Y_n$ mutually independent, $E(Y_k) = 0$, $r \ge 1$,
$a_1 = \cdots = a_n = 1/\epsilon$, one obtains an inequality given by Loève [7, p. 263]; and
assuming $Y_1, \cdots, Y_n$ mutually independent, $E(Y_k) = 0$, $r = 2$,
$a_1 \ge a_2 \ge \cdots \ge a_n > 0$, one obtains a result due to Hájek and Rényi [5].
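
The bound of Theorem 2.1 is straightforward to evaluate. The sketch below assumes the bound reads Σ b_k^r (E|X_k|^r - ψ_k^r E|X_{k-1}|^r) with X_0 = 0 and b_k = max(a_k, ψ_{k+1} b_{k+1}), b_{n+1} = 0, which is the recursive form of the maximum defining b_k. The function names and the list-based encoding are illustrative assumptions.

```python
def b_coefficients(a, psi):
    """a[k-1] holds a_k and psi[k-1] holds psi_k (psi[0] is unused).
    Uses the recursion b_k = max(a_k, psi_{k+1} * b_{k+1}), b_{n+1} = 0."""
    n = len(a)
    b = [0.0] * (n + 1)  # b[n] plays the role of b_{n+1} = 0
    for k in range(n - 1, -1, -1):
        next_psi = psi[k + 1] if k + 1 < n else 0.0
        b[k] = max(a[k], next_psi * b[k + 1])
    return b[:n]


def kolmogorov_type_bound(a, psi, abs_moments, r=1):
    """Sum of b_k^r (E|X_k|^r - psi_k^r E|X_{k-1}|^r), where
    abs_moments[k-1] = E|X_k|^r and E|X_0|^r = 0."""
    b = b_coefficients(a, psi)
    total, prev = 0.0, 0.0
    for k in range(len(a)):
        total += b[k] ** r * (abs_moments[k] - psi[k] ** r * prev)
        prev = abs_moments[k]
    return total


# psi_k = 1, equal weights, r = 2: the bound collapses to a_n^2 E(X_n^2).
print(kolmogorov_type_bound([0.5, 0.5], [0.0, 1.0], [1.0, 2.0], r=2))  # 0.5
```

With every ψ_k set to 0 the same function returns Σ a_k^r E|X_k|^r, the union-type bound.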
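PROOF. Since $E(|X_k| \mid X_1, \cdots, X_{k-1}) \ge \psi_k|X_{k-1}|$ a.e. implies that

$$E(|X_k^*| \mid X_1^*, \cdots, X_{k-1}^*) \ge \psi_k^r |X_{k-1}^*| \quad \text{a.e.},$$

where $X_k^* = (\operatorname{sign} X_k)|X_k|^r$, we can take $r = 1$ without loss of generality.
Let $A_k = \{a_i|X_i| < 1,\ i = 1, 2, \cdots, k - 1,\ a_k|X_k| \ge 1\}$, $k = 1, \cdots, n$.
Then if $j > k$ (we denote the characteristic function of a set $C$ by $\chi_C$),

$$\int_{A_k} |X_j| \, dP = E[\chi_{A_k} E\{|X_j| \mid X_1, \cdots, X_{j-1}\}] \ge E\{\chi_{A_k} \psi_j |X_{j-1}|\} = \psi_j \int_{A_k} |X_{j-1}| \, dP,$$

and by induction it follows that

$$\int_{A_k} |X_j| \, dP \ge \Bigl(\prod_{i=k+1}^{j} \psi_i\Bigr) \int_{A_k} |X_k| \, dP.$$

Since $\sum_{j=k}^{n} (b_j - \psi_{j+1}b_{j+1}) (\prod_{i=k+1}^{j} \psi_i) = b_k \ge a_k$, and since $b_j \ge \psi_{j+1}b_{j+1}$,

$$\sum_{j=1}^{n} b_j [E|X_j| - \psi_j E|X_{j-1}|] = \sum_{j=1}^{n} (b_j - \psi_{j+1}b_{j+1}) E|X_j|$$

$$\ge \sum_{j=1}^{n} (b_j - \psi_{j+1}b_{j+1}) \sum_{k=1}^{j} \int_{A_k} |X_j| \, dP = \sum_{k=1}^{n} \sum_{j=k}^{n} (b_j - \psi_{j+1}b_{j+1}) \int_{A_k} |X_j| \, dP$$

$$\ge \sum_{k=1}^{n} \sum_{j=k}^{n} (b_j - \psi_{j+1}b_{j+1}) \Bigl(\prod_{i=k+1}^{j} \psi_i\Bigr) \int_{A_k} |X_k| \, dP$$

$$\ge \sum_{k=1}^{n} \sum_{j=k}^{n} (b_j - \psi_{j+1}b_{j+1}) \Bigl(\prod_{i=k+1}^{j} \psi_i\Bigr) a_k^{-1} P(A_k) \ge \sum_{k=1}^{n} P(A_k)$$

$$= P\Bigl\{\max_{1 \le k \le n} a_k|X_k| \ge 1\Bigr\}.$$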
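The proof is now complete, and we list some special cases.
If $E(|X_k| \mid X_1, \cdots, X_{k-1}) \ge |X_{k-1}|$ a.e., then $\psi_k = 1$, $k = 2, 3, \cdots, n$, and
(2.1) becomes

(2.2) $$P\Bigl\{\max_{1 \le k \le n} a_k|X_k| \ge 1\Bigr\} \le \sum_{k=1}^{n} (b_k^r - b_{k+1}^r) E|X_k|^r = \sum_{k=1}^{n} b_k^r (E|X_k|^r - E|X_{k-1}|^r).$$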

If $\psi_k \ne 0$, $k = 2, 3, \cdots, n$, then with the change of variables

$$X_1 = X_1', \quad X_k = X_k' \Bigl(\prod_{i=1}^{k-1} \psi_{i+1}\Bigr)^{-1}$$

in (2.2), one obtains (2.1) after removing the primes.
If $\{X_1, \cdots, X_n\}$ is a semi-martingale, then so is $\{X_1^+, \cdots, X_n^+\}$ where
$X_k^+ = \max(X_k, 0)$ (see, e.g., [3], p. 295). In this case it follows from Theorem 2.1
that

(2.3) $$P\Bigl\{\max_{1 \le k \le n} a_kX_k \ge 1\Bigr\} = P\Bigl\{\max_{1 \le k \le n} a_kX_k^+ \ge 1\Bigr\} \le \sum_{k=1}^{n} (b_k^r - b_{k+1}^r) E(X_k^+)^r.$$

With $a_1 \ge a_2 \ge \cdots \ge a_n > 0$ and $r = 1$, this inequality has been given by
Chow [2]. It follows from (2.3) that if $\{X_1, \cdots, X_n\}$ is a semi-martingale

(2.4) $$P\Bigl\{\max_{1 \le k \le n} a_kX_k \ge 1\Bigr\} \le \sum_{k=1}^{n} (b_k^r - b_{k+1}^r) E|X_k|^r.$$

If $\psi_k = 0$, $k = 2, 3, \cdots, n$, then $a_k = b_k$, $k = 1, 2, \cdots, n$, and (2.1)
becomes

(2.5) $$P\Bigl\{\max_{1 \le k \le n} a_k|X_k| \ge 1\Bigr\} \le \sum_{k=1}^{n} a_k^r E|X_k|^r.$$

This inequality was obtained by Olkin and Pratt [8, p. 234] with $r = 2$ and
the additional assumption that $X_1, \cdots, X_n$ are uncorrelated. With $r = 2$ it
also appears as a special case of Theorem 3.1.

If $a_k \le a_n(\prod_{i=k+1}^{n} \psi_i)$, $k = 1, 2, \cdots, n$, then $b_k = a_n(\prod_{i=k+1}^{n} \psi_i)$, $k = 1, 2, \cdots, n$,
and we obtain from (2.1)

(2.6) $$P\Bigl\{\max_{1 \le k \le n} a_k|X_k| \ge 1\Bigr\} \le a_n^r E|X_n|^r.$$

With $n = r = 2$, we obtain the following from Theorem 2.1. Let $X_1$ and $X_2$
be random variables such that $E(X_i) = 0$, $E(X_i^2) = \sigma_i^2 < \infty$, $i = 1, 2$, and
$E(X_1X_2) = \sigma_1\sigma_2\rho$. If the regression of $X_2$ on $X_1$ is linear, then for every positive
$a_1$ and $a_2$

(2.7) $$P\{a_1|X_1| \ge 1 \text{ or } a_2|X_2| \ge 1\} \le \begin{cases} a_1^2\sigma_1^2 + a_2^2\sigma_2^2(1 - \rho^2) & \text{if } a_1^2\sigma_1^2 \ge a_2^2\sigma_2^2\rho^2, \\ a_2^2\sigma_2^2 & \text{if } a_1^2\sigma_1^2 \le a_2^2\sigma_2^2\rho^2. \end{cases}$$

To obtain this, we have used the relation $\xi\sigma_1 = \rho\sigma_2$ where $E(X_2 \mid X_1) = \xi X_1$ a.e.
THEOREM 2.2. Equality can be achieved in (2.1), so that (2.1) is sharp.
REMARK. Actually we prove slightly more than this. Even though the
hypotheses of Theorem 2.1 are strengthened by assuming that $E(X_1) = m$ and
$E(X_k \mid X_1, \cdots, X_{k-1}) = \xi_kX_{k-1}$ a.e. (in which case we take $\psi_k = |\xi_k|$),
$k = 2, 3, \cdots, n$, equality can be attained in (2.1) so long as $b_1^rE|X_1|^r \ge b_1|m|$. If
the hypothesis $E(X_1) = m$ is added to Theorem 2.1 and $b_1^rE|X_1|^r < b_1|m|$,
then (2.1) is no longer sharp, and a better bound for the case $n = 1$, $r = 2$ has
been obtained by Selberg [10].

PROOF. We introduce the notations $E(X_1) = m$, $\mu_0 = 0$, $E|X_k|^r = \mu_k^r$,
$k = 1, \cdots, n$. Since $r \ge 1$ and $E(|X_k| \mid X_1, \cdots, X_{k-1}) \ge \psi_k|X_{k-1}|$ a.e., it
follows from Hölder's inequality that

$$E|X_k|^r = E[E\{|X_k|^r \mid X_1, \cdots, X_{k-1}\}] \ge E[(E\{|X_k| \mid X_1, \cdots, X_{k-1}\})^r] \ge \psi_k^r E|X_{k-1}|^r, \quad k = 2, 3, \cdots, n.$$

Hence $b_k^r(\mu_k^r - \psi_k^r\mu_{k-1}^r) \ge 0$, $k = 2, 3, \cdots, n$.

Now suppose that $b_1^r\mu_1^r \ge b_1|m|$ and that $\sum_{k=1}^{n} b_k^r(\mu_k^r - \psi_k^r\mu_{k-1}^r) \le 1$, and
consider a random vector $Z = (Z_1, \cdots, Z_n)$ with the following distribution
(where $\psi_k = |\xi_k|$, $k = 2, 3, \cdots, n$):

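$$\begin{array}{ll}
z = (z_1, \cdots, z_n) & P(Z = z) \\[4pt]
b_1^{-1}(1, \xi_2, \xi_2\xi_3, \cdots, \prod_{i=2}^{n} \xi_i) & \tfrac{1}{2}(b_1^r\mu_1^r + b_1m) \\
-b_1^{-1}(1, \xi_2, \xi_2\xi_3, \cdots, \prod_{i=2}^{n} \xi_i) & \tfrac{1}{2}(b_1^r\mu_1^r - b_1m) \\
\pm b_2^{-1}(0, 1, \xi_3, \cdots, \prod_{i=3}^{n} \xi_i) & \tfrac{1}{2}b_2^r(\mu_2^r - \psi_2^r\mu_1^r) \\
\quad \cdots & \quad \cdots \\
\pm b_k^{-1}(0, \cdots, 0, 1, \xi_{k+1}, \cdots, \prod_{i=k+1}^{n} \xi_i) & \tfrac{1}{2}b_k^r(\mu_k^r - \psi_k^r\mu_{k-1}^r) \\
\quad \cdots & \quad \cdots \\
\pm b_n^{-1}(0, \cdots, 0, 1) & \tfrac{1}{2}b_n^r(\mu_n^r - \psi_n^r\mu_{n-1}^r) \\
(0, \cdots, 0) & 1 - \sum_{k=1}^{n} b_k^r(\mu_k^r - \psi_k^r\mu_{k-1}^r).
\end{array}$$

Then $Z_1, Z_2, \cdots, Z_n$ satisfy the conditions of Theorem 2.1; in fact they satisfy
the stronger assumptions given in the above remark.

Suppose that for some $k$, $b_k|Z_k| \ge 1$. Then for some $j \ge k$,
$b_k = a_j(\prod_{i=k+1}^{j} \psi_i)$, and on $\{b_k|Z_k| \ge 1\}$, $Z_j = Z_k(\prod_{i=k+1}^{j} \xi_i)$.
Hence

$$b_k|Z_k| = a_j|Z_j| \ge 1 \quad \text{so that} \quad \max_{1 \le k \le n} b_k|Z_k| \ge 1$$

implies $\max_{1 \le k \le n} a_k|Z_k| \ge 1$, and

$$P\Bigl\{\max_{1 \le k \le n} a_k|Z_k| \ge 1\Bigr\} = \sum_{k=1}^{n} b_k^r(\mu_k^r - \psi_k^r\mu_{k-1}^r).$$

Thus the random vector $Z$ attains equality in (2.1).
Next suppose that $b_1^r\mu_1^r \ge b_1|m|$ but that $\sum_{k=1}^{n} b_k^r(\mu_k^r - \psi_k^r\mu_{k-1}^r) > 1$, i.e., the
bound of (2.1) exceeds unity. Choose $c_1, c_2, \cdots, c_n$ such that $0 < c_k \le b_k$,
$k = 1, 2, \cdots, n$, $\sum_{k=1}^{n} c_k^r(\mu_k^r - \psi_k^r\mu_{k-1}^r) = 1$, and such that $c_1^r\mu_1^r \ge c_1|m|$. Then
in the distribution of $Z$, replace $b_k$ by $c_k$, $k = 1, 2, \cdots, n$. For a random vector
defined in this manner,

$$1 \ge P\Bigl\{\max_{1 \le k \le n} a_k|Z_k| \ge 1\Bigr\} = P\Bigl\{\max_{1 \le k \le n} b_k|Z_k| \ge 1\Bigr\} \ge P\Bigl\{\max_{1 \le k \le n} c_k|Z_k| \ge 1\Bigr\} = 1,$$

and the bound of unity is attained.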


3. Generalizations of Berge’s inequality. We consider now multivariate 
generalizations of Chebyshev’s inequality providing bounds for 


$$P\Bigl\{\max_{1 \le i \le n} a_i|X_i| \ge 1\Bigr\}$$


under assumptions regarding second moments. 

In 1919, Karl Pearson [9] published a generalization of Chebyshev's inequality
providing an upper bound in terms of second moments for the probability that
a two-dimensional random vector falls outside a given ellipse. His results may
be described as follows: Let $\mathcal{F}_\sigma$ be the class of random vectors $X = (X_1, X_2)$
with $E(X_i) = 0$, $E(X_i^2) = \sigma_i^2$, $i = 1, 2$, and $E(X_1X_2) = \sigma_{12}$; let $\mathcal{C}$ be the class
of sets $A_c \subset R^2$ of the form $A_c = \{x = (x_1, x_2): f_c(x) < 1\}$, where the boundary
of $A_c$ is the ellipse $f_c(x) = c_1x_1^2 + c_2x_2^2 + c_3x_1x_2 = 1$. Since $f_c$ is positive definite
and not less than one outside $A_c$, it follows that $X \in \mathcal{F}_\sigma$ and $A_c \in \mathcal{C}$ implies

$$c_1\sigma_1^2 + c_2\sigma_2^2 + c_3\sigma_{12} = \int_\Omega f_c(X) \, dP \ge \int_{\{X \notin A_c\}} f_c(X) \, dP \ge P\{X \notin A_c\}.$$

If $A = \{(x_1, x_2): a_i|x_i| < 1,\ i = 1, 2\}$, where $a_1$ and $a_2$ are positive, it is trivial
that

(3.1) $$P\{a_1|X_1| \ge 1 \text{ or } a_2|X_2| \ge 1\} = P\{X \notin A\} \le \inf_{A_c \subset A} \int_\Omega f_c(X) \, dP.$$

In a special case, P. O. Berge [1] computed this bound and obtained the
following inequality: If $X = (X_1, X_2) \in \mathcal{F}_\sigma$, then for all $k > 0$,

(3.2) $$P\{|X_1| \ge k\sigma_1 \text{ or } |X_2| \ge k\sigma_2\} \le [1 + (1 - \rho^2)^{1/2}]/k^2,$$

where $\rho = \sigma_{12}/(\sigma_1\sigma_2)$. Whenever $\sigma = (\sigma_1, \sigma_2, \sigma_{12})$ is such that the covariance
matrix is positive definite (i.e., $\sigma_1^2 > 0$, $\sigma_1^2\sigma_2^2 > \sigma_{12}^2$) and the bound is not greater
than one, Berge gave an example attaining equality to show that (3.2) is sharp.
The bound of (3.1) was computed in general by D. N. Lal [6] and follows from
(3.4) with $n = 2$, $\nu = 1$.

We describe now a natural generalization of Berge's result to higher
dimensions. Use a prime to denote transpose and let $\mathcal{F}_\Lambda$ be the class of random vectors
$X = (X_1, X_2, \cdots, X_n)'$ taking values in $R^n$ with moment matrix
$\Lambda = (E(X_iX_j))$. Replace the functions $f_c$ above by quadratic forms $F_M(x) = x'Mx$
where $x = (x_1, \cdots, x_n)' \in R^n$ and $M$ is an $n \times n$ positive definite matrix. If
$A_M = \{x: x'Mx < 1\}$, it follows as before that for $X \in \mathcal{F}_\Lambda$,

$$\Phi(M, \Lambda) = \int_\Omega X'MX \, dP \ge \int_{\{X \notin A_M\}} X'MX \, dP \ge P\{X \notin A_M\}.$$

If $A = \{x: a_i|x_i| < 1,\ i = 1, 2, \cdots, n\}$, we obtain

(3.3) $$P\{X \notin A\} \le \inf_{A_M \subset A} \Phi(M, \Lambda),$$

which is the desired inequality. Unfortunately the bound is not easily computed.

This generalization of Berge's result has been recently investigated by Olkin
and Pratt [8] and by Whittle [11]. They consider the set $\mathcal{M}$ of positive definite
$n \times n$ matrices $M$ for which $A_M = \{x: x'Mx < 1\} \subset A$ and prove that there is
a unique element $M^*$ of this set such that
$\inf_{M \in \mathcal{M}} \int_\Omega F_M(X) \, dP = \int_\Omega F_{M^*}(X) \, dP$
where $F_{M^*}(x) = x'M^*x$. They have not succeeded in obtaining $M^*$ but were
able to characterize it as the solution of a certain matrix equation. Using this
result they prove that the inequality (3.3) is sharp.

It is possible to obtain many inequalities related to (3.3) that are not sharp,
since $M \in \mathcal{M}$ implies $P\{X \notin A\} \le \int_\Omega X'MX \, dP$. An inequality of this kind was
given by Lal [6] and a better one by Olkin and Pratt [8].

Extensions of (3.1) to $n$ dimensions are obtained by generalizing the sets
$A = \{x \in R^2: a_1|x_1| < 1,\ a_2|x_2| < 1\}$ to $n$ dimensions, and the sets $\mathcal{F}_\sigma$ to sets of
$n$-dimensional random vectors. Clearly both of these generalizations may be
accomplished in many ways other than those used to obtain (3.3). In the
remainder of this section, we obtain an extension of (3.1) which differs from (3.3)
in that only certain terms of the covariance matrix are assumed known.

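THEOREM 3.1. Let $\nu$ be an integer in the interval $0 \le \nu \le n - 1$, and let
$r_1, \cdots, r_\nu$ be integers such that $r_1 = 1$ and $1 \le r_k \le k$, $k = 2, 3, \cdots, \nu$. If
$X = (X_1, \cdots, X_n)'$ is a random vector with $E(X_i^2) = \sigma_i^2 < \infty$, $i = 1, 2, \cdots, n$
and $E(X_{r_i}X_{i+1}) = \phi_i < \sigma_{r_i}\sigma_{i+1}$, $i = 1, 2, \cdots, \nu$ and if $\epsilon_i > 0$, $i = 1, 2, \cdots, n$,
then

(3.4) $$P\{|X_i| < \epsilon_i,\ i = 1, 2, \cdots, n\} \ge 1 - \sum_{i=1}^{n} \frac{\sigma_i^2}{\epsilon_i^2} + \sum_{i=1}^{\nu} \frac{c_i - d_i^{1/2}}{2},$$

where

$$c_i = \frac{\sigma_{r_i}^2}{\epsilon_{r_i}^2} + \frac{\sigma_{i+1}^2}{\epsilon_{i+1}^2} \quad \text{and} \quad d_i = c_i^2 - 4\frac{\phi_i^2}{\epsilon_{r_i}^2\epsilon_{i+1}^2}, \quad i = 1, 2, \cdots, \nu$$

(if $\nu = 0$, regard the empty sum of the bound as zero).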
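
The following sketch evaluates the lower bound of Theorem 3.1 under the assumption that it reads 1 - Σ σ_i²/ε_i² + ½ Σ (c_i - d_i^{1/2}), with c_i and d_i as defined in the theorem. This reading is reconstructed from the surrounding derivation, and the function name and argument layout are hypothetical. As a consistency check, with n = 2, ν = 1 and ε_i = kσ_i it reproduces the complement of Berge's bound (3.2).

```python
from math import sqrt


def theorem_3_1_lower_bound(sigma2, eps, phi, r):
    """sigma2[i-1] = E(X_i^2) and eps[i-1] = eps_i for i = 1..n;
    phi[i-1] = E(X_{r_i} X_{i+1}) and r[i-1] = r_i for i = 1..nu."""
    bound = 1.0 - sum(s / e ** 2 for s, e in zip(sigma2, eps))
    for i, (phi_i, r_i) in enumerate(zip(phi, r), start=1):
        # Python index i addresses X_{i+1}; index r_i - 1 addresses X_{r_i}.
        c = sigma2[r_i - 1] / eps[r_i - 1] ** 2 + sigma2[i] / eps[i] ** 2
        d = c * c - 4 * phi_i ** 2 / (eps[r_i - 1] ** 2 * eps[i] ** 2)
        bound += (c - sqrt(d)) / 2
    return bound


def berge_upper_bound(k, rho):
    """Right-hand side of (3.2)."""
    return (1 + sqrt(1 - rho ** 2)) / k ** 2


# Unit variances, eps_i = k = 2, correlation 0.6.
lower = theorem_3_1_lower_bound([1.0, 1.0], [2.0, 2.0], [0.6], [1])
print(lower, 1 - berge_upper_bound(2.0, 0.6))
```

With empty `phi` and `r` (ν = 0) the function returns 1 - Σ σ_i²/ε_i².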

REMARK. Inequality (3.4) is applicable whenever all the moments $E(X_i^2)$
are known. It utilizes all knowledge of second moments whenever (possibly
after a permutation of the random variables $X_i$) every $3 \times 3$ principal minor
of the matrix $(E(X_iX_j))$ has at least one unknown entry.

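PROOF. We begin by assuming that $\epsilon_i = 1$. For $0 \le i \le n$, let

(3.5) $$a_i = \begin{cases} \dfrac{c_i - d_i^{1/2}}{2\phi_i} & \text{if } 1 \le i \le \nu \text{ and } \phi_i \ne 0, \\ 0 & \text{otherwise.} \end{cases}$$

Since $c_i > 2|\phi_i|$ for $1 \le i \le \nu$, $d_i \ne 0$ and

$$d_i^{1/2} = [(c_i + 2|\phi_i|)(c_i - 2|\phi_i|)]^{1/2} > c_i - 2|\phi_i|,$$

so that $|a_i| < 1$, $0 \le i \le n$.
For $k = 1, 2, \cdots, n - 1$, let

$$F_{k+1}(x) = F_{k+1}(x_1, \cdots, x_n) = \sum_{i=0}^{k} (x_{i+1} - a_ix_{r_i})^2/(1 - a_i^2).$$

If $a_k \ne 0$,

$$F_{k+1}(x) = F_k(x) - x_{r_k}^2 + (x_{r_k} - a_kx_{k+1})^2/(1 - a_k^2) + x_{k+1}^2;$$

using this and the relation $F_{k+1}(x) \ge F_k(x)$, it is easily established by induction
that $F_k(x) \ge x_i^2$, $1 \le i \le k$. Hence

$$F_n(x) \ge 0 \quad \text{and} \quad F_n(x) \ge 1 \text{ for } x \notin A.$$

From this and the relations

$$\frac{a_i}{1 - a_i^2} = \frac{\phi_i}{d_i^{1/2}}, \qquad \frac{a_i^2}{1 - a_i^2} = \frac{c_i - d_i^{1/2}}{2d_i^{1/2}},$$

valid when $a_i \ne 0$, we obtain

(3.6) $$E[F_n(X)] = \sum_{i=0}^{n-1} \frac{\sigma_{i+1}^2 - 2a_i\phi_i + a_i^2\sigma_{r_i}^2}{1 - a_i^2} = \sum_{i=1}^{n} \sigma_i^2 - \sum_{i=1}^{\nu} \frac{c_i - d_i^{1/2}}{2} \ge \int_{\{X \notin A\}} F_n(X) \, dP \ge P\{X \notin A\}.$$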


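Inequality (3.4) follows from (3.6) after the change of variables
$X_i^* = \epsilon_iX_i$, $i = 1, 2, \cdots, n$ is made and the asterisks removed, so that the proof is
complete.

If in (3.4) we take $\nu = n - 1$ and $r_i = i$, we obtain

(3.7) $$P\{|X_i| < \epsilon_i,\ i = 1, 2, \cdots, n\} \ge 1 - \frac{\sigma_1^2}{2\epsilon_1^2} - \frac{\sigma_n^2}{2\epsilon_n^2} - \frac{1}{2} \sum_{i=1}^{n-1} \Bigl[\Bigl(\frac{\sigma_i^2}{\epsilon_i^2} + \frac{\sigma_{i+1}^2}{\epsilon_{i+1}^2}\Bigr)^2 - \frac{4\phi_i^2}{\epsilon_i^2\epsilon_{i+1}^2}\Bigr]^{1/2},$$

where $\phi_i = E(X_iX_{i+1})$, $i = 1, 2, \cdots, n - 1$. With $\nu = n - 1$ and $r_i = 1$
(3.4) becomes

(3.8) $$P\{|X_i| < \epsilon_i,\ i = 1, 2, \cdots, n\} \ge 1 - \sum_{i=1}^{n} \frac{\sigma_i^2}{\epsilon_i^2} + \frac{1}{2} \sum_{i=1}^{n-1} \Bigl\{\frac{\sigma_1^2}{\epsilon_1^2} + \frac{\sigma_{i+1}^2}{\epsilon_{i+1}^2} - \Bigl[\Bigl(\frac{\sigma_1^2}{\epsilon_1^2} + \frac{\sigma_{i+1}^2}{\epsilon_{i+1}^2}\Bigr)^2 - \frac{4\phi_i^2}{\epsilon_1^2\epsilon_{i+1}^2}\Bigr]^{1/2}\Bigr\},$$

where $\phi_i = E(X_1X_{i+1})$, $i = 1, 2, \cdots, n - 1$.
Note that with $\nu = 0$, (3.4) becomes (2.5) with $r = 2$.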
We investigate the sharpness of (3.4) only in the special cases (3.7) and (3.8), and only under the additional hypotheses that $E(X_i) = 0$, $i = 1, 2, \dots, n$.

Consider first (3.7); i.e., assume $\nu = n - 1$ and $r_i = i$. If $Z = (Z_1, \dots, Z_n)'$ is a random vector for which equality holds in (3.7) then equality must hold throughout (3.6) when $X$ is replaced by $Z$. This means

$$F_n(Z) = 0 \ \text{a.e. on } \{Z \in A\}, \tag{3.9}$$

and

$$F_n(Z) = 1 \ \text{a.e. on } \{Z \notin A\}. \tag{3.10}$$

Since $\{F_n(x) \le 1\}$ is strictly convex, $F_n(Z) \ge 1$ on $\{Z \notin A\}$ implies that there is at most one root of the equation $F_n(x) = 1$ on each plane $x_i = \pm 1$. Hence there are at most $2n$ roots not in $A$ of the equation $F_n(x) = 1$. It is easily verified that these roots are plus and minus the columns $b^{(1)}, \dots, b^{(n)}$ of the Green's matrix $B = (b_{ij})$ where $b_{ij} = b_{ji} = \beta_j/\beta_i$ $(i \le j)$, $\beta_1 = 1$ and $\beta_s = \prod_{m=1}^{s-1} a_m$, $s > 1$.

In order that (3.9) be satisfied, $Z$ must with probability one assume in $A$ only the value $(0, \dots, 0)'$; in order that (3.10) be satisfied, $Z$ must with probability one assume in $A^c$ only the values $\pm b^{(i)}$. Thus $Z$ must have a distribution of the form

$$P\{Z = b^{(i)}\} = p_i/2, \qquad P\{Z = -b^{(i)}\} = p_i^*/2, \qquad P\{Z = 0\} = 1 - \sum_{i=1}^{n}\left(\frac{p_i}{2} + \frac{p_i^*}{2}\right). \tag{3.11}$$

If $u = (u_1, u_2, \dots, u_n)'$ where $u_i = (p_i - p_i^*)/2$, then $E(Z) = B'u$. Since $|B| = \prod_{i=1}^{n-1}(1 - a_i^2)$ and $|a_i| < 1$, $B$ is positive definite and $E(Z) = (0, \dots, 0)'$ if and only if $u_i = 0$, i.e., $p_i = p_i^*$, $i = 1, 2, \dots, n$.

Now consider the equations

$$E(Z_i^2) = \sigma_i^2,\ i = 1, 2, \dots, n, \qquad E(Z_iZ_{i+1}) = \varphi_i,\ i = 1, 2, \dots, n - 1. \tag{3.12}$$
From (3.12) we obtain $\sigma_i^2 + \sigma_{i+1}^2 - a_i\varphi_i = \varphi_i/a_i$, and since we require $|a_i| < 1$, (3.12) is consistent with (3.5). It also follows from (3.12) that


| — Laks ob-1 + aback , oF — ak oi4 + a Of 
3.13 o LT Oe Tk | Tk — k 9 
( - 2(1 = at_1) . 211 — at) 


—- On wees 


-_ 


a - of 


The expressions for Pr and p, are easily verified; the other p, are obtained by 
computing (ox —_ oj we 1.) =— OL som. = onsen) and (ot _ ono.+1) —_ 
ai(ci41 — axo,) in terms of the p; . 

Since (3.13) is the solution of (3.12) it follows that (3.11) together with (3.13) does provide an example satisfying the hypotheses of (3.7) provided that $p_k \ge 0$ for all $k$. Furthermore,

$$\sum_{i=1}^{n} p_i = \frac12\sigma_1^2 + \frac12\sigma_n^2 + \frac12\sum_{i=1}^{n-1}\left[(\sigma_i^2 + \sigma_{i+1}^2)^2 - 4\varphi_i^2\right]^{1/2},$$

so that the example attains equality in (3.7).
It follows from Schwarz’s inequality that o;/02 2 |a;| and this implies p, 2 0; 
similarly, p, 2 0.For2 Sk fn—1,p, 2 Oif  — 2s + en 2B 0 
and of — 2ajors: + atoy = O. That is, p, = Oif 


9 = 9 9 


> = 73 3 co i ie 
a | O,—-1 FL OK FK+1 


| o Qe ; D2 
(3.14) es and s-= a 


ry *,* . . 2 9 ‘ 
These conditions are satisfied, e.g., for o, = ; , 2, °*:,nor for ge = 0, 
k=1,2,---,n—1. 


, 


With n = 3, the covariance matrix 


provides an example of p. < 0, since both conditions of (3.14) are violated. 
Thus we cannot claim sharpness for (3.7) under all conditions. 

If we write $F_n(x)$ in the form $F_n(x) = x'Mx$, then $B^{-1} = M$. In view of Theorem 3.7 of [8], this is as expected.

To investigate the sharpness of (3.8), let $b^{(1)}, b^{(2)}, \dots, b^{(n)}$ be the columns of the matrix $B = (b_{ij})$ where $b_{ii} = 1$, $b_{ij} = a_{j-1}a_{i-1}$ $(i \ne j)$ and $a_k$ is given by (3.5) with $r_k = 1$, $k = 1, 2, \dots, n - 1$ and $a_0 = 1$. Suppose that $Z = (Z_1, Z_2, \dots, Z_n)'$ has the distribution

$$P\{Z = b^{(1)}\} = P\{Z = -b^{(1)}\} = \frac12\left[\sigma_1^2 - \sum_{i=1}^{n-1}\frac{a_i^2(\sigma_{i+1}^2 - a_i^2\sigma_1^2)}{1 - a_i^4}\right] = \frac{p_1}{2},$$

$$P\{Z = b^{(k)}\} = P\{Z = -b^{(k)}\} = \frac{\sigma_k^2 - a_{k-1}^2\sigma_1^2}{2(1 - a_{k-1}^4)} = \frac{p_k}{2}, \qquad k = 2, 3, \dots, n,$$

$$P\{Z = 0\} = 1 - \sum_{k=1}^{n} p_k.$$
Using the relations 

2 ee; — d}) ai e(ce; — d}) 23 29; 

ha 9.,2 ? ;= tga ’ 
293 1 + aj e;(e; - di) 
one verifies that $\sum_{k=1}^{n} p_k$ is the bound of (3.8). It is straightforward to verify that $E(Z_i^2) = \sigma_i^2$, $i = 1, 2, \dots, n$ and that $E(Z_1Z_{i+1}) = \varphi_i$, $i = 1, 2, \dots, n - 1$. Since $P\{\max_{1 \le i \le n} |Z_i| = 1\} = \sum_{k=1}^{n} p_k$, equality is attained in (3.8) whenever $X$ has the same distribution as $Z$. Of course the example is valid only if $p_k \ge 0$, $k = 1, 2, \dots, n$. From Schwarz's inequality, it follows that $\sigma_k^2 - a_{k-1}^2\sigma_1^2 \ge 0$ so that $p_k \ge 0$, $k = 2, 3, \dots, n$. However, if the first and second rows and columns of (3.15) are interchanged, an example is obtained for which $p_1 < 0$. Thus, as in the case of (3.7), we cannot claim that (3.8) is sharp under all conditions.

This example can be obtained by arguments similar to those used in investigating sharpness of (3.7). Both examples can be obtained using the results of [8].
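The extremal distribution for (3.8) can be checked numerically. The following sketch assumes $\varepsilon_i = 1$, the reading $b_{ii} = 1$, $b_{ij} = a_{i-1}a_{j-1}$ with $a_0 = 1$, and $p_k = (\sigma_k^2 - a_{k-1}^2\sigma_1^2)/(1 - a_{k-1}^4)$ for $k \ge 2$; the function names and the sample moments are hypothetical.

```python
from math import sqrt

def extremal_star(sigma2, phi):
    """Support columns, masses p_k and constants a_k of a symmetric law of the
    kind described for (3.8), with eps_i = 1 (illustrative sketch).
    sigma2[i] = E(X_{i+1}^2); phi[i-1] = E(X_1 X_{i+1})."""
    n = len(sigma2)
    a = [1.0]                                        # a_0 = 1
    for i in range(1, n):
        c = sigma2[0] + sigma2[i]
        d = c * c - 4.0 * phi[i - 1] ** 2
        a.append((c - sqrt(d)) / (2.0 * phi[i - 1]) if phi[i - 1] != 0 else 0.0)
    p = [0.0] * n
    for k in range(1, n):                            # p_2, ..., p_n
        p[k] = (sigma2[k] - a[k] ** 2 * sigma2[0]) / (1.0 - a[k] ** 4)
    p[0] = sigma2[0] - sum(a[k] ** 2 * p[k] for k in range(1, n))   # p_1
    cols = [[1.0 if i == k else a[i] * a[k] for i in range(n)] for k in range(n)]
    return cols, p, a

def second_moment(cols, p, i, j):
    """E(Z_i Z_j) when Z = +b^(k) or -b^(k), each with probability p_k / 2."""
    return sum(pk * col[i] * col[j] for pk, col in zip(p, cols))

sigma2, phi = [0.3, 0.2, 0.1], [0.1, 0.05]           # hypothetical moments
cols, p, a = extremal_star(sigma2, phi)
exceed = sum(sigma2) - sum(ak * ph for ak, ph in zip(a[1:], phi))   # 1 - bound of (3.8)
print(p, sum(p), exceed)
```

In this example all $p_k$ are non-negative, the prescribed second moments are reproduced, and $\sum p_k$ agrees with one minus the bound of (3.8).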


4. A lemma on separability. In the remainder of this paper, results of the preceding sections are used to obtain some inequalities of the Chebyshev type for continuous parameter stochastic processes. Separability of the processes will of course be required (the term “separable” will be used to mean “separable relative to the class of all closed subsets of the extended real line”, although a weaker separability would suffice). From now on, the underlying probability space $(\Omega, \mathfrak{B}, P)$ will be such that $P$ is complete.

If $\{X_t, t \ge 0\}$ is a separable process and $S$ is a countable set satisfying the definition of separability and containing the points $0, \tau$, then $\{\sup_{t \in [0,\tau]} |X_t| < 1\}$ is measurable and $P\{\sup_{t \in [0,\tau]} |X_t| < 1\} = P\{\sup_{t \in S \cap [0,\tau]} |X_t| < 1\}$. However for a positive function $f$ on $[0, \infty)$ it is not clear that $\{\sup_{t \in [0,\tau]} [|X_t|/f(t)] < 1\}$ is measurable and the following lemma is required.

LEMMA 4.1. Let $\{X_t, t \ge 0\}$ be a separable process, let $f$ be a positive function on $[0, \infty)$ having at most countably many discontinuities, and let $\tau > 0$. If $S$ is a countable set dense in $[0, \infty)$ satisfying the definition of separability and containing the set of discontinuities of $f$ as well as $0$ and $\tau$, then $\{\omega : \sup_{t \in [0,\tau]} [|X_t(\omega)|/f(t)] < 1\}$ is measurable and

$$P\left\{\sup_{t \in [0,\tau]} \frac{|X_t|}{f(t)} < 1\right\} = \lim_{k \to \infty} P\left\{|X_t| \le \frac{k-1}{k} f(t) \text{ for all } t \in S \cap [0, \tau]\right\}. \tag{4.1}$$





698 Z. W. BIRNBAUM AND ALBERT W. MARSHALL 


Proof. Let $\{t_i\}_{i=1}^{\infty}$ be an ordering of $S \cap [0, \tau]$ with the property that $\sup_{t \in [0,\tau]} \inf_{1 \le i \le n} |t - t_i| \to 0$ as $n \to \infty$. Let $\{s_{0,n}, s_{1,n}, \dots, s_{n,n}, s_{n+1,n}\} = \{0, t_1, t_2, \dots, t_n, \tau\}$ where $0 = s_{0,n} \le s_{1,n} < \dots < s_{n,n} \le s_{n+1,n} = \tau$. Let $a_{k,n} = \sup_{t \in (s_{k-1,n},\, s_{k,n})} f(t)$, $k = 1, 2, \dots, n + 1$, $n = 1, 2, \dots$, and let

$$f_n(t) = \sum_{k=1}^{n+1} a_{k,n}\,\chi_{(s_{k-1,n},\, s_{k,n})}(t) + \sum_{k=0}^{n+1} f(s_{k,n})\,\chi_{\{s_{k,n}\}}(t),$$

where for any set $E$, $\chi_E$ represents its characteristic function. By considering separately the case that $t \in S \cap [0, \tau]$ and the case that $t$ is a continuity point of $f$, it is easily shown that $\lim_{n \to \infty} f_n(t) = f(t)$ for all $t \in [0, \tau]$.

Let $A_n = \{|X_t| \le f_n(t)$ for all $t \in S \cap [0, \tau]\}$, and let $B_n = \{|X_t| \le f_n(t)$ for all $t \in [0, \tau]\}$. Since $\{X_t, t \ge 0\}$ is separable and $P$ is complete, it can be shown that for all $n$, $B_n$ is measurable and $P(A_n) = P(B_n)$. Since $f_n \ge f_{n+1} \ge f$, it follows that $A_n \supset A_{n+1}$ and $B_n \supset B_{n+1}$ for all $n$. Hence

$$P\Big(\bigcap_{n=1}^{\infty} A_n\Big) = \lim_{n \to \infty} P(A_n) = \lim_{n \to \infty} P(B_n) = P\Big(\bigcap_{n=1}^{\infty} B_n\Big). \tag{4.2}$$

Moreover,

$$\bigcap_n A_n = \{|X_t| \le f(t) \text{ for all } t \in S \cap [0, \tau]\} \quad\text{and}\quad \bigcap_n B_n = \{|X_t| \le f(t) \text{ for all } t \in [0, \tau]\}. \tag{4.3}$$

Now let $C_k = \{|X_t| \le [(k - 1)/k] f(t)$ for all $t \in [0, \tau]\}$, $k = 1, 2, \dots$; applying (4.2) and (4.3) with $f(t)$ replaced by $[(k - 1)/k] f(t)$, it follows that $C_k$ is measurable and $P(C_k) = P\{|X_t| \le ((k - 1)/k) f(t)$ for all $t \in S \cap [0, \tau]\}$. Since $C_k \subset C_{k+1}$ for all $k$, $P\{\sup_{t \in [0,\tau]} (|X_t|/f(t)) < 1\} = \lim_{k \to \infty} P(C_k) = \lim_{k \to \infty} P\{|X_t| \le ((k - 1)/k) f(t)$ for all $t \in S \cap [0, \tau]\}$, as was to be proved.

We remark that the assumptions of the above lemma are not sufficient to imply 
that the set {| X,| < f(t) for all ¢ € [0, r]} is measurable. However {supicjo,,; 
(| X,|/f()) <1} c {| X.| < f(t) for all t ¢ (0, r]} so that if this latter set is 
measurable and if P{supszeto,-) (| X;|/f(t)) <1} 2 &, then P{| X,| < f(t) for 
all te (0, 7]} = ®. 


5. An inequality for semi-martingales. In this section we apply (2.3) to 
obtain an inequality for semi-martingales, and give an example to demonstrate 
sharpness. 

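THEOREM 5.1. If $\{X_t, t \ge 0\}$ is a separable semi-martingale such that $E|X_t| = \mu(t) < \infty$ for all $t \le \tau$, and if $f$ is a non-decreasing positive function on $[0, \tau]$ such that the Riemann-Stieltjes integral in the following bound exists, then

$$P\left\{\sup_{t \in [0,\tau]} \frac{|X_t|}{f(t)} \ge 1\right\} \le \frac{\mu(0)}{f(0)} + \int_0^{\tau} \frac{d\mu(t)}{f(t)}. \tag{5.1}$$

Proof. Define $S$, $s_{0,n}, s_{1,n}, \dots, s_{n+1,n}$ as in the proof of Lemma 4.1, and let $f_k = [(k - 1)/k] f$. Since $X_{s_{0,n}}, X_{s_{1,n}}, \dots, X_{s_{n+1,n}}$ satisfy the conditions of (2.4),

$$P\left\{\max_{0 \le i \le n+1} \frac{|X_{s_{i,n}}|}{f_k(s_{i,n})} > 1\right\} \le \frac{\mu(0)}{f_k(0)} + \sum_{i=1}^{n+1} \frac{\mu(s_{i,n}) - \mu(s_{i-1,n})}{f_k(s_{i,n})}.$$

Since $\lim_{n \to \infty} \sup_{i = 1, 2, \dots, n+1} (s_{i,n} - s_{i-1,n}) = 0$ and since the integral exists,

$$\lim_{n \to \infty} \sum_{i=1}^{n+1} \frac{\mu(s_{i,n}) - \mu(s_{i-1,n})}{f_k(s_{i,n})} = \int_0^{\tau} \frac{d\mu(t)}{f_k(t)}.$$

Then since

$$\lim_{n \to \infty} P\left\{\max_{0 \le i \le n+1} \frac{|X_{s_{i,n}}|}{f_k(s_{i,n})} > 1\right\} = P\left\{\sup_{t \in S \cap [0,\tau]} \frac{|X_t|}{f_k(t)} > 1\right\},$$

we obtain

$$P\left\{\sup_{t \in S \cap [0,\tau]} \frac{|X_t|}{f_k(t)} > 1\right\} \le \frac{\mu(0)}{f_k(0)} + \int_0^{\tau} \frac{d\mu(t)}{f_k(t)}.$$

Hence by Lemma 4.1,

$$P\left\{\sup_{t \in [0,\tau]} \frac{|X_t|}{f(t)} \ge 1\right\} \le \frac{\mu(0)}{f(0)} + \int_0^{\tau} \frac{d\mu(t)}{f(t)}.$$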


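If $\{X_t, t \ge 0\}$ is a martingale and $r \ge 1$, $\{|X_t|^r, t \ge 0\}$ is a semi-martingale and it follows from Theorem 5.1 that if $\mu_r(t) = E|X_t|^r$,

$$P\left\{\sup_{t \in [0,\tau]} \frac{|X_t|}{f(t)} \ge 1\right\} \le \frac{\mu_r(0)}{f^r(0)} + \int_0^{\tau} \frac{d\mu_r(t)}{f^r(t)}. \tag{5.2}$$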
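To make the bound concrete, the Riemann-Stieltjes sums used in the proof of Theorem 5.1 can be evaluated numerically. The sketch below is only an illustration: it assumes a martingale with $E(X_t^2) = t$ (standard Brownian motion started at 0 is one such process), takes $r = 2$ and the hypothetical boundary $f(t) = 2 + t$ on $[0, 1]$, and uses a uniform grid in place of the points $s_{i,n}$.

```python
def stieltjes_bound(mu, f, tau, n):
    """mu(0)/f(0) + sum_i [mu(s_i) - mu(s_{i-1})] / f(s_i) on a uniform grid of
    n steps over [0, tau]; approximates mu(0)/f(0) + integral of d mu(t) / f(t)."""
    s = [tau * i / n for i in range(n + 1)]
    total = mu(s[0]) / f(s[0])
    for i in range(1, n + 1):
        total += (mu(s[i]) - mu(s[i - 1])) / f(s[i])
    return total

mu2 = lambda t: t                  # assumed second moment E(X_t^2) = t
f2 = lambda t: (2.0 + t) ** 2      # f(t)^2 for the hypothetical boundary f(t) = 2 + t
approx = stieltjes_bound(mu2, f2, 1.0, 10_000)
exact = 1.0 / 2.0 - 1.0 / 3.0      # integral of dt / (2 + t)^2 over [0, 1]
print(approx, exact)
```

Under these assumptions the right side of (5.2) is $1/6$, so the path leaves the region $|x| < 2 + t$ before time 1 with probability at most $1/6$.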
The restriction of Theorem 5.1 that $f$ be monotone is not necessary; in any case, $g(t) = \inf_{t \le s \le \tau} f(s)$ is monotone and $g(t) \le f(t)$ on $[0, \tau]$ so that

$$P\left\{\sup_{t \in [0,\tau]} \frac{|X_t|}{f(t)} \ge 1\right\} \le \frac{\mu(0)}{g(0)} + \int_0^{\tau} \frac{d\mu(t)}{g(t)}. \tag{5.3}$$

One can prove (5.3) sharp by replacing $f$ by $g$ in the example of the following theorem.


THEOREM 5.2. Equality can be attained in (5.1) whenever the bound does not 
exceed one. 


Proof. Let $w$ be a random variable such that

$$P\{w \le w_0\} = [\mu(0+)/f(0)] + \alpha(w_0)$$

where $\alpha(w)$ is the Lebesgue-Stieltjes integral $\int_{(0,w]} [d\mu(t+)/f(t)]$ (we denote $\lim_{s \downarrow t} \mu(s)$ by $\mu(t+)$ and similarly define $\mu(t-)$). Let

$$\eta(t) = (\mu(t) - \mu(t-))/[\mu(t+) - \mu(t-)]$$

unless $\mu$ is continuous at $t$, in which case $\eta(t) = 1$ (define $\mu(0-) = \mu(0)$). The process $\{Z_t, 0 \le t \le \tau\}$ defined on $[0, \tau]$ by

$$Z_t(w) = \begin{cases} 0, & t < w,\\ \eta(w) f(w), & t = w,\\ f(w), & \tau \ge t > w, \end{cases}$$

as claimed. 
is a semi-martingale since it has non-decreasing sample functions.


~F0) f(t) 
so that the process satisfies the conditions of Theorem 5.1. 

The existence of the Riemann-Stieltjes integral $\int_0^{\tau} [d\mu(t)/f(t)]$ implies that $f$ and $\mu$ have no common discontinuity points, so that $\sup_{t \in [0,\tau]} [Z_t(w)/f(t)] = 1$ whenever $w \le \tau$. But $P\{w \le \tau\}$ is the bound of (5.1) so that the process attains equality, and the proof is complete.

It is possible to modify the Z, process of Theorem 5.2 to obtain a martingale 
attaining equality in (5.2). Where the sample function Z;(w) jumps to some 
value, say v, the modification jumps to v or —v with probabilities chosen so that 
the martingale condition is satisfied. 


6. An inequality for a class of second-order processes. We now apply (3.7) 
to obtain an inequality for second-order processes satisfying certain regularity 
conditions. 

The inequality, together with an ingenious heuristic derivation, has already
been given by Whittle [12]. Using (3.7) as a starting point we give a more
straightforward proof, and, in the stationary case, we show that the inequality
is sharp by defining a process attaining equality.

The procedures used to obtain an inequality for processes from (3.7) might 
also be used with (3.8) as a starting point. If this is tried, only a trivial bound is 
obtained. 

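THEOREM 6.1. Let $\{X_t, t \ge 0\}$ be a separable stochastic process with $E(X_t) = 0$ for all $t$, and let $f$ be a positive function on $[0, \infty)$ with at most countably many discontinuities. For non-negative $s$ and $t$, let $\sigma(s, t) = E(X_sX_t)$, $\sigma^2(t) = \sigma(t, t)$, and $g(s, t) = \sigma(s, t)/[f(s)f(t)]$. If $g$ has continuous third partial derivatives, then

$$P\left\{\sup_{t \in [0,\tau]} \frac{|X_t|}{f(t)} \ge 1\right\} \le \frac12[g(0, 0) + g(\tau, \tau)] + \int_0^{\tau} \left[g(t, t)\,\frac{\partial^2 g(x, y)}{\partial x\,\partial y}\bigg|_{x=y=t}\right]^{1/2} dt. \tag{6.1}$$

Proof. Since $g(\cdot, \cdot)$ is symmetric, it follows that for all non-negative $t$,

$$\begin{aligned}
g_1(t) &= \frac{\partial g(x, y)}{\partial x}\bigg|_{x=y=t} = \frac{\partial g(x, y)}{\partial y}\bigg|_{x=y=t} = g_2(t),\\
g_{1,2}(t) &= \frac{\partial^2 g(x, y)}{\partial x\,\partial y}\bigg|_{x=y=t} = \frac{\partial^2 g(x, y)}{\partial y\,\partial x}\bigg|_{x=y=t} = g_{2,1}(t),\\
g_{1,1}(t) &= \frac{\partial^2 g(x, y)}{\partial x^2}\bigg|_{x=y=t} = \frac{\partial^2 g(x, y)}{\partial y^2}\bigg|_{x=y=t} = g_{2,2}(t).
\end{aligned} \tag{6.2}$$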

Since the third partial derivatives of $g$ are continuous, it follows from Taylor's theorem that

$$g(t, t) + g(t + \Delta, t + \Delta) = 2g(t, t) + \Delta[g_1(t) + g_2(t)] + \tfrac12\Delta^2[g_{1,1}(t) + 2g_{1,2}(t) + g_{2,2}(t)] + o(\Delta^2),$$

and

$$g(t, t + \Delta) = g(t, t) + \Delta g_2(t) + \tfrac12\Delta^2 g_{2,2}(t) + o(\Delta^2)$$

for all non-negative $\Delta$ and $t$. Making use of (6.2) we obtain

$$\{[g(t, t) + g(t + \Delta, t + \Delta)]^2 - 4g^2(t, t + \Delta)\}^{1/2} = 2\Delta\{g(t, t)g_{1,2}(t)[1 + o(\Delta^2)/\Delta^2]\}^{1/2} = 2\Delta[g(t, t)g_{1,2}(t)]^{1/2}[1 + o(\Delta)/\Delta], \tag{6.3}$$

for all $t, \Delta \ge 0$.


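Define $S$, $s_{0,n}, s_{1,n}, \dots, s_{n+1,n}$ as in the proof of Lemma 4.1, and let $f_k = [(k - 1)/k] f$. Applying (3.7) and (6.3) we obtain

$$\begin{aligned}
P\left\{\max_{0 \le i \le n+1} \frac{|X_{s_{i,n}}|}{f_k(s_{i,n})} > 1\right\}
&\le \frac{k^2}{(k-1)^2}\left\{\frac12[g(0, 0) + g(\tau, \tau)] + \frac12\sum_{i=0}^{n}\{[g(s_{i,n}, s_{i,n}) + g(s_{i+1,n}, s_{i+1,n})]^2 - 4g^2(s_{i,n}, s_{i+1,n})\}^{1/2}\right\}\\
&= \frac{k^2}{(k-1)^2}\left\{\frac12[g(0, 0) + g(\tau, \tau)] + \sum_{i=0}^{n}(s_{i+1,n} - s_{i,n})[g(s_{i,n}, s_{i,n})g_{1,2}(s_{i,n})]^{1/2}\left[1 + \frac{o(s_{i+1,n} - s_{i,n})}{s_{i+1,n} - s_{i,n}}\right]\right\}.
\end{aligned} \tag{6.4}$$

The limit of the right side of (6.4) as $n \to \infty$ is

$$\frac{k^2}{(k-1)^2}\left\{\frac12[g(0, 0) + g(\tau, \tau)] + \int_0^{\tau}[g(t, t)g_{1,2}(t)]^{1/2}\,dt\right\},$$

and the limit of the left side of (6.4) as $n \to \infty$ is $P\{\sup_{t \in S \cap [0,\tau]} [|X_t|/f_k(t)] > 1\}$. From Lemma 4.1 it follows that

$$P\left\{\sup_{t \in [0,\tau]} \frac{|X_t|}{f(t)} \ge 1\right\} = \lim_{k \to \infty} P\left\{\sup_{t \in S \cap [0,\tau]} \frac{|X_t|}{f_k(t)} > 1\right\} \le \frac12[g(0, 0) + g(\tau, \tau)] + \int_0^{\tau}[g(t, t)g_{1,2}(t)]^{1/2}\,dt,$$

and the proof is complete.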
COROLLARY 6.2. Retain the hypotheses and notation of Theorem 6.1 and suppose that there is a real function $h$ such that $g(x, y) = h(y - x)$ for all non-negative $x$ and $y$. If $H(\cdot) = \{1 - [h^2(\cdot)/h^2(0)]\}^{1/2}$ has a derivative $H'(0)$ at the origin, then $h'(0) = 0$ and

$$P\left\{\sup_{t \in [0,\tau]} \frac{|X_t|}{f(t)} \ge 1\right\} \le h(0)[1 + \tau H'(0)] = h(0) + \tau[-h(0)h''(0)]^{1/2}. \tag{6.5}$$

Proof. The second bound of (6.5) follows directly from (6.1). If $t_i = (i - 1)\tau/(n - 1)$, $\sigma_i^2 = g(t_i, t_i) = h(0)$, $i = 1, 2, \dots, n$, $\varphi_i = g(t_i, t_{i+1}) = h(\tau/(n - 1))$, $i = 1, 2, \dots, n - 1$, then by applying (3.7) and passing to the limit on $n$, one obtains the first bound of (6.5). This bound can be rigorously established by showing that it is equal to the second bound. By computing $H'(t)$ and using the fact that $H'(0)$ exists, one can show that $h'(0) = 0$. The hypotheses of Theorem 6.1 imply that $H'$ is continuous at the origin, so that by using a Taylor series expansion of $h(t)$, one obtains

$$H'(0) = \lim_{t \to 0} \frac{-h(t)h'(t)}{h(0)[h^2(0) - h^2(t)]^{1/2}} = \left[-\frac{h''(0)}{h(0)}\right]^{1/2},$$

and the desired result follows.


THEOREM 6.3. Equality can be achieved in (6.5) whenever the bound does not exceed one.

Proof. The bound of (6.5) depends on $\{X_t, t \ge 0\}$ only through $h(0)$ and $H'(0)$. To prove the theorem, we show that for all possible values of these parameters, there is a process $\{Z_t, 0 \le t \le \tau\}$ attaining equality in (6.5) with $E(Z_t) = 0$ and with

$$h_Z(\Delta) = E(Z_tZ_{t+\Delta})/[f(t)f(t + \Delta)], \qquad H_Z(\Delta) = \{1 - [h_Z^2(\Delta)/h_Z^2(0)]\}^{1/2}$$

satisfying $h_Z(0) = h(0)$, $H_Z'(0) = H'(0)$.

Let $\Omega = [0, \tau] \times \{-1, 1\} \cup \{(0, 0)\}$, $\mathfrak{B}$ be the Borel subsets of $\Omega$, and let $P$ be the probability measure defined on $\mathfrak{B}$ by

$$\begin{aligned}
&P\{(0, 1)\} = P\{(0, -1)\} = P\{(\tau, 1)\} = P\{(\tau, -1)\} = h(0)/4,\\
&P\{(\theta, \delta) : 0 < \theta < a,\ \delta = 1\} = P\{(\theta, \delta) : 0 < \theta < a,\ \delta = -1\} = \tfrac12 a\,h(0)H'(0), \quad 0 < a \le \tau,\\
&P\{(0, 0)\} = 1 - h(0)[1 + \tau H'(0)].
\end{aligned}$$

Define the process $\{Z_t, 0 \le t \le \tau\}$ on $(\Omega, \mathfrak{B}, P)$ by

$$Z_t(\theta, \delta) = f(t)\,\delta\exp[-|t - \theta|H'(0)].$$

Then $E(Z_t) = 0$ by symmetry, and

$$\begin{aligned}
h_Z(\Delta) &= \tfrac12 h(0)\{\exp[-|t - 0|H'(0) - |t + \Delta - 0|H'(0)] + \exp[-|t - \tau|H'(0) - |t + \Delta - \tau|H'(0)]\}\\
&\quad + \int_0^{\tau}\exp[-|t - \theta|H'(0)]\exp[-|t + \Delta - \theta|H'(0)]\,h(0)H'(0)\,d\theta\\
&= h(0)\exp\{-\Delta H'(0)\}[1 + \Delta H'(0)].
\end{aligned}$$

Thus $h_Z(0) = h(0)$. Direct computation of $H_Z'(0) = [-h_Z''(0)/h(0)]^{1/2}$ yields $H_Z'(0) = H'(0)$. Thus the process $\{Z_t, 0 \le t \le \tau\}$ satisfies the conditions of Corollary 6.2. Since

$$P\left\{\sup_{t \in [0,\tau]} \frac{|Z_t|}{f(t)} \ge 1\right\} = P\{(\theta, \delta) : 0 \le \theta \le \tau\} = h(0)[1 + \tau H'(0)],$$

the proof is complete.
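The covariance computation in this proof can be checked numerically. The sketch below assumes the construction as read here: atoms of total mass $h(0)/2$ at $\theta = 0$ and at $\theta = \tau$ (both signs combined), a density $h(0)H'(0)$ on $(0, \tau)$, and $Z_t/f(t) = \delta\exp[-|t - \theta|H'(0)]$. The parameter values and the midpoint-rule integration are illustrative choices, not part of the paper.

```python
from math import exp

def h_Z(t, delta, tau, h0, c, steps=20_000):
    """Numerical E(Z_t Z_{t+delta}) / [f(t) f(t+delta)] for the process of
    Theorem 6.3, with c standing for H'(0) (illustrative sketch)."""
    def kernel(theta):
        return exp(-abs(t - theta) * c - abs(t + delta - theta) * c)
    atoms = 0.5 * h0 * (kernel(0.0) + kernel(tau))       # point masses at 0 and tau
    width = tau / steps
    integral = sum(kernel((j + 0.5) * width) for j in range(steps)) * width
    return atoms + h0 * c * integral                     # continuous part, midpoint rule

def closed_form(delta, h0, c):
    """h(0) exp(-delta H'(0)) [1 + delta H'(0)], the value claimed in the proof."""
    return h0 * exp(-delta * c) * (1.0 + delta * c)

# Hypothetical parameters with h(0)[1 + tau H'(0)] = 0.5, so the bound is below one.
print(h_Z(0.3, 0.2, 1.0, 0.2, 1.5), closed_form(0.2, 0.2, 1.5))
```

The two printed values agree to numerical accuracy, and the result does not depend on $t$, as the stationarity assumption $g(x, y) = h(y - x)$ requires.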
In order to apply (6.5), one does not need to know the function $h$ but only $h(0)$ and $H'(0)$; presumably a better bound could be given if $h$ were known. The preceding example shows that if $h$ is of the form $h(\Delta) = \sigma^2(1 + \alpha\Delta)e^{-\alpha\Delta}$ $(\alpha \ge 0, \Delta \ge 0)$, then no such improvement is possible.


Acknowledgments. The authors are most grateful to Robert M. Blumenthal 
and to the referee for a number of valuable suggestions and comments. 
MAXIMAL INDEPENDENT STOCHASTIC PROCESSES¹
By C. B. Bell
San Diego State College and University of California, Berkeley


Introduction and summary. This paper concerns the following problem posed by R. Pyke (1958). What is the cardinality, $M_k$, of the maximal family of stochastically independent random variables defined on a given space $\Omega$, of cardinality $\overline{\overline{\Omega}} = k$?

Since maximality is sought, the investigation is limited to two-valued, non-trivial (tvnt) random variables; and the σ-algebra of measurable subsets of $\Omega$ is taken to be that generated by the family of random variables. With these restrictions the problem is essentially one of cardinality.

The results are summarized in the table below. 


Theorem Number                        1               2        3                    4

Cardinality, k, of space              k < ℵ_0         𝔠        k = k^{ℵ_0}          ℵ_0
Cardinality, M_k, of a maximal
  tvnt family                         [log_2 k]       2^𝔠      2^k                  ℵ_0


Theorem 1 follows from the fact that stochastic independence entails the non- 
vanishing of certain finite intersections of elementary sets. Theorem 2 is a result 
of Kakutani, Kodaira and Oxtoby [4, 5, 6]. Theorem 3 is a consequence of a 
set theoretic result of Tarski [11], and a theorem of Banach [1, 2, 10, 11], which 
results were used in proofs of Theorem 2. Theorem 4 follows from a construction 
and a lemma of Marczewski [9]. 

The paper is divided into five sections. Section 1 introduces the notation and
terminology. Section 2 discusses two types of independence. Sections 3, 4, and
5 treat, respectively, the finite, non-countable and countable cases.


1. Terminology and notation. Let $\Omega$ be an arbitrary fixed abstract space. “$\overline{\overline{\Omega}}$” will denote the cardinality of $\Omega$; and “$\emptyset$” will denote the empty subset of $\Omega$.

If $A$ is an arbitrary subset of $\Omega$, “$A^0$” will denote the complement, $\Omega - A$, of $A$; and “$A^1$” will sometimes be used synonymously with $A$ for notational convenience.

A “σ-algebra,” $\mathcal{S}$, of subsets of $\Omega$ is a class containing $\Omega$ and closed under countable unions and complementation. A “probability measure” $P$ on $\mathcal{S}$ is a countably additive, non-negative, real-valued set function such that $P(\Omega) = 1$; and the triplet $(\Omega, \mathcal{S}, P)$ is called a “probability measure space.”

Since maximality is of prime interest here, it is desirable to restrict considerations to that of two-valued, non-trivial (tvnt) random variables $\{X_t, t \in T\}$, for which the following definitions and restrictions are natural. (a) “$A_t$” $= \{X_t = 1\}$; “$A_t^0$” $= \{X_t = 0\}$; “$p_t$” $= P(A_t) = 1 - P(A_t^0)$; $0 < p_t < 1$ for all $t \in T$, and (b) $\mathcal{S} = \mathcal{S}(\{A_t, t \in T\})$, i.e., the least σ-algebra with respect to which all the $X_t$ are measurable.

A “tvnt random variable” is then a point function on $\Omega$ satisfying (a) and (b). In the sequel only tvnt random variables will be considered.


2. Two types of independence. A family $\{X_t, t \in T\}$ of tvnt random variables on $\Omega$ will be said to be “stochastically independent w.r.t. (with respect to) a collection $\{p_t, t \in T\}$” of probabilities if there exists on $\mathcal{S} = \mathcal{S}(\{A_t, t \in T\})$ a probability measure $P$ such that $P(\bigcap_{j=1}^{n} A_{t_j}^{i_j}) = \prod_{j=1}^{n} p_{t_j}^{(i_j)}$ for each finite subclass $\{A_{t_1}, \dots, A_{t_n}\}$ of $\mathcal{S}$ and each sequence $\{i_j\}$ of 0's and 1's (here $p_t^{(1)} = p_t$ and $p_t^{(0)} = 1 - p_t$). $P$ is called the “stochastic extension” of the $\{p_t\}$.

A closely related type of independence which will be useful is that of set independence.

The $\{X_t\}$ are said to be “σ-independent” if for each at-most-countable subclass $N = \{t_1, t_2, \dots\}$ of $T$ with $t_i \ne t_j$ for $i \ne j$ and each sequence $\{i_j\}$ of 0's and 1's, $\bigcap_j A_{t_j}^{i_j} \ne \emptyset$.

Finally, the $\{X_t\}$ are said to be “finitely independent” if every finite intersection $\bigcap_{j=1}^{n} A_{t_j}^{i_j} \ne \emptyset$.

Relations between the types of independence form the basis for the proofs of Theorems 1, 2, and 3. The fundamental result in this direction [1, 2, 8, 10, 11] is

LEMMA 1. If $\{X_t\}$ is a σ-independent family, then for arbitrary given probabilities $\{p_t\}$, $\{X_t\}$ is stochastically independent w.r.t. the $\{p_t\}$.

Now, since any set of positive measure is non-empty; and since, when $0 < p_t < 1$ for all $t \in T$, each $\prod_{j=1}^{n} p_{t_j}^{(i_j)} \ne 0$, the following partial converse of Lemma 1 is valid.

LEMMA 2. If a family $\{X_t\}$ is stochastically independent w.r.t. some given $\{p_t\}$ $(0 < p_t < 1)$, then the $\{X_t\}$ are finitely independent.

Since for a finite family $\{X_t\}$ σ-independence and finite independence are equivalent, one can use Lemmas 1 and 2 and an elementary algebraic formula to find the maximum cardinality $M_k$ for finite $k$.


3. Finite spaces. If $\Omega$ has finite cardinality, then any family $\{X_t\}$ of independent random variables on $\Omega$ is necessarily finite. But for any finite class $\{A_1, A_2, \dots, A_n\}$ there are exactly $2^n$ sets of the form $\bigcap_{j=1}^{n} A_j^{i_j}$, which sets are mutually disjoint and non-empty whenever the $\{X_t\}$ are finitely independent.

However, as previously mentioned, for the finite family $\{X_t\}$ finite independence is equivalent to σ-independence. Hence Lemmas 1 and 2 and an elementary construction yield,

THEOREM 1. If $\overline{\overline{\Omega}} = k < \aleph_0$, then

(α) for each $m \le \log_2 k$, there exists a family of $m$ tvnt stochastically independent random variables on $\Omega$;

(β) if $m > \log_2 k$, there exists no such family; and consequently

(γ) $M_k = [\log_2 k]$ for $k < \aleph_0$.
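One elementary construction behind part (α) can be sketched as follows. This is an illustrative choice, not necessarily the construction the author had in mind: on $\Omega = \{0, \dots, k - 1\}$, let $X_j(\omega)$ be the $j$-th binary digit of $\omega$ for $\omega < 2^m$, with all leftover points assigned to the all-zero pattern. The function names and the probabilities $p_j$ below are hypothetical.

```python
from fractions import Fraction

def cells(k):
    """Group the points of {0,...,k-1} by their pattern (X_0,...,X_{m-1}),
    where m = floor(log2 k); the cells are the atoms of the generated sigma-algebra."""
    m = k.bit_length() - 1                      # floor(log2 k) for k >= 1
    out = {}
    for w in range(k):
        pattern = tuple((w >> j) & 1 for j in range(m)) if w < 2 ** m else (0,) * m
        out.setdefault(pattern, set()).add(w)
    return m, out

def cell_measure(pattern, p):
    """Product measure of a cell: p_j for a digit 1, 1 - p_j for a digit 0."""
    prob = Fraction(1)
    for bit, pj in zip(pattern, p):
        prob *= pj if bit else 1 - pj
    return prob

m, c = cells(10)                                # k = 10 gives m = 3
p = [Fraction(1, 3), Fraction(1, 2), Fraction(1, 5)]    # arbitrary 0 < p_j < 1
both = sum(cell_measure(pat, p) for pat in c if pat[0] == 1 and pat[2] == 1)
print(m, len(c), both == p[0] * p[2])
```

All $2^m$ cells are non-empty, so the family is finitely independent, and the product measure on the cells makes the $X_j$ stochastically independent. With $m + 1$ variables, $2^{m+1} > k$ disjoint non-empty cells would be needed, which is the content of part (β).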

Since the proofs for non-countable spaces also make use of the relationships 
between the types of independence, it is feasible to treat them next. 


4. Non-countable spaces. An indirect result of the Kakutani-Kodaira-Oxtoby non-separable extension of Lebesgue measure [4, 5, 6] is the following.

THEOREM 2. If $\overline{\overline{\Omega}} = \mathfrak{c}$, the power of the continuum, then there exists a family of $2^{\mathfrak{c}}$ tvnt stochastically independent random variables on $\Omega$; and, therefore, $M_{\mathfrak{c}} = 2^{\mathfrak{c}}$.

The proof of Theorem 2 is based on a lemma of Tarski ([12], Hilfssatz 3.16, p. 61), which lemma can be used to obtain the solution for a special class of cardinal numbers.

TARSKI’s LEMMA. [f Q = k® = k = BN (where kZ = Daven k"), then there 
exists a class, U of subsets of 2 such that 

(1) & = 2"; and 

(2) for each pair £ and M of disjoint subclasses (of U) with cardinalities less 


than m, 
Os] ¢ [Ye 


Be 


In view of Lemma 1 the interest here lies in a formulation of Tarski’s Lemma 
in terms of o-independence. It is readily established that 
Lemma 3. If Q = k® = k = Np for some m > N, then there exists a family 


{X,, te T} of two-valued random variables on Q such that 

(1) T = 2, ice., the family is of maximal cardinality; and 

(2) the family {X,} is o-independent. 

The maximal independence theorem now follows from Lemmas | and 3, and 
the fact that k=! = k™* for k = 2. 

THeorEeM 3. Jf Q =k = bY > &, , then there exists a family of 2" tvnt sto- 
chastically independent random variables on Q. Consequently, M, = 7. 

At this point one notices that although the proofs have some aspects in common the results for $k < \aleph_0$ and $k > \aleph_0$ are quite different in nature. For the case $k = \aleph_0$, not only does the result differ from the two preceding results, but also the nature of the proof is different. In this last case one employs the results of Marczewski [9] concerning purely atomic measures.


5. Countable spaces. If $(\Omega, \mathcal{S}, P)$ is a probability measure space, then $B \in \mathcal{S}$ is said to be an atom of $P$ if (α) $P(B) > 0$ and, (β) whenever $B \supset E \in \mathcal{S}$, $P(E) = 0$ or $P(E) = P(B)$.

Further, if $\Omega$ is the union of atoms, $P$ is called purely atomic.

Marczewski [9] has essentially proved that

LEMMA 4. If $\{X_n\}$ is a countable family of tvnt random variables stochastically independent w.r.t. $\{p_n\}$, then the stochastic extension $P$ is purely atomic if and only if $\sum_{n=1}^{\infty} \min(p_n, 1 - p_n) < \infty$.
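The role of the summability condition can be illustrated on the single cell $\{X_1 = 0, X_2 = 0, \dots\}$, whose mass under the product measure is $\prod_n (1 - p_n)$. The sketch below is only an illustration of the dichotomy, not a proof of Lemma 4; it compares a summable and a non-summable choice of $p_n \le \tfrac12$, both chosen here purely for convenience.

```python
from fractions import Fraction

def zero_cell_mass(p, n_terms):
    """Partial product prod_{n=1}^{N} (1 - p(n)): the product-measure mass of
    the event {X_1 = 0, ..., X_N = 0}."""
    mass = Fraction(1)
    for n in range(1, n_terms + 1):
        mass *= 1 - p(n)
    return mass

summable = lambda n: Fraction(1, (n + 1) ** 2)      # sum of p_n is finite
divergent = lambda n: Fraction(1, n + 1)            # sum of p_n is infinite

for N in (10, 100, 1000):
    print(N, float(zero_cell_mass(summable, N)), float(zero_cell_mass(divergent, N)))
```

For the summable sequence the partial products equal $(N + 2)/(2N + 2)$ and stay above $\tfrac12$, so the limiting cell keeps positive mass; for the divergent sequence they equal $1/(N + 1)$ and tend to zero.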
Now, if $\{X_t, t \in T\}$ is a family of stochastically independent random variables on a countable space $\Omega$, then clearly
(i) each probability measure $P$ on $\mathcal{S}$ is purely atomic;
(ii) for each countable $N \subset T$, $\sum_{t \in N} \min(p_t, 1 - p_t) < \infty$; and, therefore,
(iii) at most countably many of the probabilities $\{\min(p_t, 1 - p_t), t \in T\}$ are non-zero.
Consequently, one concludes
LEMMA 5. Any family of tvnt stochastically independent random variables on a countable space $\Omega$ is at most countable.
In order to complete the solution it is sufficient to construct a countable tvnt family on an arbitrary countable space $\Omega$.
In his constructive proof of the necessity of Lemma 4, Marczewski [9] demonstrates essentially that
LEMMA 6. There exists a countable set $\{p_n\}$ of probabilities; a space $Y$; and a countable family $\{Z_n\}$ of tvnt random variables on $Y$ (with $B_n = \{Z_n = 1\}$ and $p_n = P(B_n)$) such that
(α) the $\{Z_n\}$ are stochastically independent w.r.t. $\{p_n\}$;
(β) $0 < p_n \le \tfrac12$ for all $n$; and $\sum_{n=1}^{\infty} p_n < \infty$;
(γ) the stochastic extension, $P$, is purely atomic; and
(δ) the atoms of $P$ are exactly those sets of the form $\bigcap_{n=1}^{\infty} B_n^{i_n}$ where $\sum_{n=1}^{\infty} i_n < \infty$.
The space $Y$ above is not necessarily countable. However, it is purely atomic and has countably many atoms. Hence, there exists a 1-1 mapping, $\varphi$, of the atoms of $P$ onto the single points of any given countable space $\Omega$. The naturally induced measure, σ-algebra, and random variables on $\Omega$ constitute the desired construction. (A construction with a Markov process is given by Blackwell [13].)
One, therefore, concludes
THEOREM 4. If $\Omega$ is an arbitrary countable space,
(a) there exists a countable family of stochastically independent tvnt random variables on $\Omega$;
(b) there exists no non-countable family of stochastically independent tvnt random variables on $\Omega$; and, hence,
(c) $M_{\aleph_0} = \aleph_0$.
ON MARKOV CHAIN POTENTIALS¹
By John G. Kemeny and J. Laurie Snell


Dartmouth College 


1. Introduction. In [3] we developed a theory of potentials for denumerable 
Markov Chains. The purpose of this note is to supplement these results in two 
ways: We will show for an important special class of Markov chains that they 
are normal (i.e., that the potential operators exist), and we will generalize cer- 
tain results due to Spitzer [5]. 

While our previous paper developed a theory both for transient and for re- 
current chains, our present note will deal only with the recurrent case. The key 
definitions, notations, and theorems for this type of chain will be summarized 
below. Parenthetical references to theorems will always refer to [3], Section 3. 

We consider both measures (row vectors) and functions (column vectors); the former are denoted by Greek letters, the latter by ordinary lower case letters. The theory for functions is dual to that for measures. One passes from one to the other by replacing a transition matrix $\{P_{ij}\}$ by the “reverse chain” $\{\alpha_j P_{ji}/\alpha_i\}$, where $\alpha > 0$ and $\alpha P = \alpha$.

If the limit $\nu = \lim_n [\mu(I + P + \dots + P^n)]$ exists, we say that $\nu$ is a potential, and $\mu$ is its charge. The set of states for which $\mu_i$ is non-zero is the support of the charge. If $1$ is the constant function, and if $\mu 1$ is defined, then $\mu 1 = 0$. Dually, one defines potential functions. If the column vector $f$ is a charge of a potential function, and $\alpha f$ is finite, then $\alpha f = 0$.

Let $N_{ij}^{(n)}$ be the mean of the number of times that the process is in state $j$ in the first $n$ steps, starting at $i$. If $\lim_n [N_{jj}^{(n)} - N_{ij}^{(n)}] = C_{ij} \ge 0$ exists for all $i$ and $j$, we say that the chain is normal. Under certain assumptions, if $\nu$ exists then $\nu = -\mu C$. A sufficient condition is that $\mu$ be a weak charge, i.e., that not only $\mu C$ is finite but also $Cf$, where $f_i = \mu_i/\alpha_i$ is the dual charge. (See Theorem 15.) For example, all charges of finite support are weak. The dual operator $G_{ij} = \lim_n [N_{ii}^{(n)}\alpha_j/\alpha_i - N_{ij}^{(n)}]$ serves a similar role for functions. All ergodic (positive recurrent) chains are normal, and the finiteness of $\mu C$ suffices to assure the existence of the potential.

Many of our considerations will be relative to a given set of states $E$. Then $B^E_{ij}$ is the probability of entering $E$ at $j$, starting at $i$. ${}^E N_{ij}$ is the mean number of times in $j$, starting at $i$, before hitting $E$; this is taken to be 0 if $i$ or $j$ is in $E$, and we write ${}^k N_{ij}$ if $E = \{k\}$. By $P^E_{ij}$ we mean the probability that starting from $i$ in $E$ we reenter $E$ at $j$; $P^E$ is itself a recurrent transition matrix, for the states in $E$. The submatrix of $C$ consisting of rows and columns in $E$ is denoted by $C_E$.


Received November 4, 1960; revised November 22, 1960. 
1 This research was supported by the National Science Foundation through a grant given 
to the Dartmouth Mathematics Projects. 
Of special interest are the limits $\lim_n P^n B^E$, giving the entrance probabilities “in the long run”. If these limits exist, and they always do for a normal chain, then the limits are independent of the starting states, hence the limiting matrix has identical rows $\lambda^E$. (See Theorems 16, 20.) The existence of these limits for two-point sets is equivalent to normalcy. (See Theorem 14.) Similarly, we define ${}^E\lambda$ to be the common row of the limiting matrix $P^n({}^E N)$.

A chain is ergodic if the mean first passage times $M_{ij}$ are finite, and strong ergodic if the passage times “in equilibrium,” $\alpha M$, are also finite.

Spitzer considered recurrent Markov chains obtained from sums of independent 
random variables with a common distribution. He assumed that this distribution 
was a two dimensional symmetric distribution. He showed that these chains are 
normal. In the first part of this paper we establish this result for the not neces- 
sarily symmetric one dimensional case under the assumption of a finite variance. 

For transient chains the basic potential operator is $N = I + P + P^2 + \cdots$. In this case for any set $E$, $N_E^{-1}$ exists and

$$I - P^E = N_E^{-1}.$$


Spitzer showed that for the recurrent chains that he considered, and any finite set $E$, $C_E^{-1}$ exists. He established an elegant formula for $I - P^E$, using this inverse. In the second part of this paper we give necessary and sufficient conditions that $C_E^{-1}$ exist for finite sets, for any normal recurrent chain. In particular this inverse exists for all finite sets for certain symmetric chains and for ergodic chains. We obtain a generalization of the Spitzer formula (corollary to Theorem 2) for all such chains. We use these results to shed new light on certain previous results of ours.


2. A class of normal chains. Let $\{p_i\}$ be a probability distribution on the positive and negative integers. We consider the Markov chain having transition probabilities $P_{ij} = p_{j-i}$. Assume that this chain is started in state 0. Then the resulting random variables $S_0, S_1, \dots$ represent sums of independent random variables with a common distribution. We assume that $S_1$ has finite variance $\sigma^2$. The mean must be 0 for the chain to be recurrent. We assume that this is the case. We are interested first in studying $B^E_{ij}$ for a finite set $E$. For convenience we assume that the smallest element of $E$ is 0.

LEMMA 1. $\lim_{i \to +\infty} B^E_{ij} = B^+_j$ and $\lim_{i \to -\infty} B^E_{ij} = B^-_j$ exist.

Proof. We shall prove that $\lim_{i \to -\infty} B^E_{ij}$ exists. We refer to the process $\{S_n\}$ determined by $P$ as the basic process, and define an auxiliary process called the ladder process as follows: Let $\bar S_0 = S_0 = r$. We define $\bar S_{n+1}$ to be the first state $> \bar S_n$ reached by the basic process. We thus obtain the ladder process which represents the progress of the basic process watched only when it makes progress to the right. This ladder process is again a Markov chain with transition probabilities given by $\bar P_{ij} = \bar p_{j-i}$, equal to the probability that the basic process started in $i$ reaches a state $> i$ for the first time at state $j$. Let $\mu = M_0[\bar S_1]$, the mean of $\bar S_1$ if the process starts at 0. This mean is finite by a theorem of Spitzer [6]. He proved

$$\mu = (\sigma/\sqrt{2}) \cdot c$$

where

$$0 < c = \exp\Big\{\sum_{k=1}^{\infty} (1/k)\big(\tfrac12 - \Pr[S_k > 0]\big)\Big\} < \infty.$$

We denote by $\bar B^F_{rs}$ the probability that the ladder process started at $r$ reaches the set $F$ of all non-negative integers for the first time at $s$. Then

$$\begin{aligned}
\bar B^F_{rs} &= \Pr_r[\bar S_n = s \text{ for some } n \text{ and } \bar S_m < 0 \text{ for } m < n]\\
&= \Pr_r[\bar S_n = s \text{ for some } n] - \sum_{k=1}^{s} \Pr_r[\bar S_{n-1} = s - k \text{ and } \bar S_n = s \text{ for some } n].
\end{aligned} \tag{1}$$

By the renewal theorem,

$$\lim_{r \to -\infty} \Pr_r[\bar S_n = s \text{ for some } n] = 1/\mu. \tag{2}$$

Let $\bar B^F_s = \lim_{r \to -\infty} \bar B^F_{rs}$. By (1) and (2) this limit exists and

$$\bar B^F_s = \frac{1}{\mu}\Big(1 - \sum_{j=1}^{s} \bar p_j\Big) = \frac{1}{\mu}\sum_{j > s} \bar p_j.$$

Note that $\sum_s \bar B^F_s = 1$. Now

$$B^E_{rj} = \sum_s \bar B^F_{rs} B^E_{sj}. \tag{3}$$

Since $B^E_{sj} \le 1$, and since $\sum_s \bar B^F_{rs} = \sum_s \bar B^F_s = 1$, we have

$$B^-_j = \lim_{r \to -\infty} B^E_{rj} = \sum_s \bar B^F_s B^E_{sj}.$$

The proof for $B^+$ is similar.
THEOREM 1. Assume that $\{p_i\}$ has mean 0 and finite variance $\sigma^2$. Then
$\lim_n P^n B^E = 1\lambda^E$
where $\lambda^E = \tfrac12 B^+ + \tfrac12 B^-$.


PROOF. Let $1^+$ be a column vector with 1 for the non-negative states and 0
otherwise. By the Central Limit Theorem

$\lim_{n\to\infty} (P^n 1^+)_0 = \tfrac12.$

For a null chain, the probability of being in any finite set at time $n$ tends to 0.
Hence we have

$\lim_n P^n 1^+ = \tfrac12\cdot 1.$

For fixed $i$, let $f$ be a column with value $B^+_i$ on the non-negative states and $B^-_i$
on the negative states. Then

$\lim_{n\to\infty} P^n f = (\tfrac12 B^+_i + \tfrac12 B^-_i)\cdot 1.$

Let $g$ be the $i$th column of $B^E$. Then $f - g$ has limit 0 at $+\infty$ and $-\infty$. Since
$P^n \to 0$, it is an easy consequence that

$\lim_{n\to\infty} P^n(f - g) = 0.$

Thus
$\lim_n P^n g = (\tfrac12 B^+_i + \tfrac12 B^-_i)\cdot 1$
as was to be proved.


This theorem shows that sums of independent random variables with mean 0 
and finite variance always constitute a normal null chain. 


3. Restricted potential operators. From now on we assume that we have an
arbitrary normal chain. Quantities for the reverse chain will be denoted by a hat.
Thus $\hat\lambda$ in the following theorem is $\lambda$ computed for the reverse chain.

We will assume from here on that $E$ is a finite set of at least 2 states.

THEOREM 2. $G_E(I - P^E) = -I + 1\lambda^E$; $(I - P^E)C_E = -I + \hat l\alpha_E$, where
$\hat l_i = \hat\lambda^E_i/\alpha_i$.

PROOF. We have shown (Theorem 20) that if we choose a column of $I - P^E$
as charge on a set $E$, then this is always a weak charge, and the resulting
potential is the corresponding column of $B^E - 1\lambda^E$. Hence, $-G(I - P^E)
= B^E - 1\lambda^E$, where the right side has rows corresponding to all states, but
columns corresponding only to the states in $E$. We obtain the first result by
restricting the rows to those in $E$, while the second result is obtained by duality,
i.e., by applying the first result to the reverse chain. (The vector $\hat l$ is the
dual of $\lambda$, and $\alpha$ is the dual of the constant vector 1.)

COROLLARY. If $C_E^{-1}$ exists, then $(I - P^E) = -C_E^{-1} + \hat l(\alpha_E C_E^{-1})$.

If $G_E^{-1}$ exists, then $(I - P^E) = -G_E^{-1} + (G_E^{-1}1)\lambda^E$.

THEOREM 3. There is a measure $w$ such that $C_E(I - P^E) = -I + 1w$.

Proor. (CgP") i; = Dcies lim, [Ni — N§&P]-PE;. We may interchange the 
limit with the summation. 


(CeP*) sj = lim [ 2) NED’Pey — (NSP — dy + BUI YT, 
n ek 


= (;; + 6; — 7 + lim [> NE PE, — Nii’ }. 


keE 


We let w; be \} minus the quantity in brackets, and the theorem follows. 

COROLLARY. If $\nu = -\mu C$ is a potential, then $\nu_E(I - P^E) = \mu_E$.

This result is immediate from the theorem and from the fact that charges 
have total measure 0. We obtain an obvious dual result for potential functions. 

THEOREM 4. $C_E^{-1}$ exists if and only if $C_E\hat l \ne 0$. If the inverse exists, then $C_E^{-1}1 = c\hat l$,
where $c = \alpha_E C_E^{-1}1$ is a positive constant.

PROOF. For any finite set, $\alpha_E\hat l = \sum_{i\in E}\hat\lambda^E_i = 1$ (by a result in the previous
paper). Hence $\hat l \ne 0$, and thus if $C_E$ is non-singular, $C_E\hat l \ne 0$.

Conversely, suppose that $C_E\hat l \ne 0$. We compute $C_E(I - P^E)C_E$ twice, once
from each of the last two theorems. We obtain:

$-C_E + (C_E\hat l)\alpha_E = -C_E + 1(wC_E).$

If $C_E\hat l \ne 0$, then $1 = cC_E\hat l$ and $\alpha_E = cwC_E$, for some constant $c$, which is clearly
positive.

Suppose that for some measure $\pi$, $\pi C_E = 0$. Then $\pi 1 = 0$ from the above.
And multiplying the result of Theorem 3 by $\pi$ we find that $0 = -\pi + (\pi 1)w =
-\pi$. Hence $\pi = 0$, and thus $C_E$ is non-singular. Therefore, $C_E^{-1}1 = c\hat l$, and if we
multiply by $\alpha_E$ we obtain the value of $c$. This completes the proof.

The dual of this result is that $G_E$ is non-singular if and only if $\lambda^E G_E \ne 0$, and
then $\alpha_E G_E^{-1} = \hat c\lambda^E$, where $\hat c = \alpha_E G_E^{-1}1$.

The Corollary to Theorem 2 provides the desired generalization of Spitzer's
result, if we know that the inverses exist. His result was applicable to symmetric
sums of independent random variables. We see more generally:

THEOREM 5. If $P$ is either ergodic, or symmetric and $P^{(n)}_{ii} = P^{(n)}_{jj}$ for all $i, j, n$,
then $C_E^{-1}$ and $G_E^{-1}$ exist for all finite sets $E$. And for any chain, if $\lambda^E > 0$, then $G_E^{-1}$
exists, while if $\hat\lambda^E > 0$, then $C_E^{-1}$ exists.

PROOF. We shall show the results for $C_E$; the others are dual. If $C_E\hat l = 0$,
then $C_E$ must have a 0 $i$th column whenever $\hat\lambda^E_i > 0$. Hence $\hat\lambda^E > 0$ would require
$C_E = 0$, which contradicts Theorem 2. For an ergodic chain $C_{ij} = M_{ij}\alpha_j$
(see Theorem 24, Corollary 1), hence all off-diagonal components of $C$ are positive.
Furthermore, we know that $C_{ij}\alpha_i/\alpha_j + C_{ji} = {}^jN_{ii} \ge 1$ if $i \ne j$ (see Theorem
22, Corollary 1). If $P$ is symmetric and $P^{(n)}_{ii} = P^{(n)}_{jj}$, then $\alpha_i = \alpha_j$ and $C_{ij} = C_{ji}$.
Hence for $i \ne j$, $C_{ij} \ge \tfrac12$. And this completes the proof.

The significance of these results is that if $C_E$ is non-singular, then it, together
with $\alpha_E$, determines $P^E$. This is seen from the Corollary to Theorem 2 and from
Theorem 4. This is a generalization of the result in [2], that the transition matrix
of a finite chain is determined by $M$ and $\alpha$.

It is easy to construct examples where C;; = 0 if 7 = 7, and hence where Cz 
is singular for all E. The class of examples in [3] has this property in all null- 
recurrent cases. More generally, if in a null chain M;; happens to be finite, then 
C;; = 0; hence random walk in one dimension with a reflecting barrier is an- 
other example. 

LEMMA 2. 


: Bar(n) y(n) Exr 
Lim, [2 BENS} — N§ ] = dia; — "Ni;, 


kek 


where d; = *%;/a 
Proor. If i ¢ E, then B%, = 5% , and both sides are 0. 
For iz E we will show this result for the reverse chain. Using that BA, = 
a,” Ni;/a;, we find that for the reverse chain our assertion is equivalent to 
lim [2 NSP FN — Ni] = *y, — "Nii. 
n kek 
In this form the result can be proven by the type of systems-theorem argument 
we used repeatedly in our previous paper (see the explanation preceding Lemma 
6). Here “N;,; is the mean number of entries from k into 7 before returning to E. 
THEOREM 6. If $f$ is a charge with support in the finite set $E$, then $Cf = Gf +
(\lambda^E C_E f_E)1$.
PROOF. We will show first that the relation holds on $E$. Then we will show
that $B^E(Cf) = Cf$. The result will then follow, since $B^E(Gf) = Gf$ (see Theorem
11), and $B^E1 = 1$. If we compute $G_E(I - P^E)C_E$ in two ways from Theorem 2,
we find $C_E + (G_E\hat l)\alpha_E = G_E + 1\cdot(\lambda^E C_E)$. We multiply this on the right by $f_E$,
and use the fact that $\alpha_E f_E = 0$ for a charge with support in $E$. We obtain the
desired result inside $E$. Since $E$ is finite,

(B*C),; = lim 2 Bi. INS}? — NEP) = Ci; — lim [> BEN? — i}. 

n ek n ek 
Thus, by the lemma, B*C = C + *N — da. But “Nf = 0, since f has its sup- 
port in E, and af = 0 for a charge, hence B*Cf = Cf. Which concludes the proof. 

This shows that for charges with finite support C not only serves as potential 
operator for measures, but it “almost” serves for functions as well. 

If we are dealing with a finite Markov chain, then we may choose $E$ to be
the set of all states. Then $\lambda^E = \hat\lambda^E = \alpha$, hence $C1 = (1/c)1$. This says that
the row-sums of $C$, hence of $\{M_{ij}\alpha_j\}$ (see Theorem 24) are constant. This we
proved independently in [2], and there identified the sum as the trace of $Z - A =
(A - I)C$. (See p. 81 and Theorem 31. There is a discrepancy of 1 due to a
difference in the definition of $M$.)
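The constancy of the row-sums of $\{M_{ij}\alpha_j\}$ can be checked on a small example. The sketch below is only an illustration under stated assumptions: the 3-state transition matrix is hypothetical, the mean first passage times are computed with the convention $M_{jj} = 0$, and the helper names (`solve`, `mfp`, `alpha`) are not taken from the paper.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, non-singular)."""
    n = len(A)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

# Hypothetical ergodic chain on three states (an assumption for illustration).
P = [[F(1, 2), F(1, 4), F(1, 4)],
     [F(1, 3), F(1, 3), F(1, 3)],
     [F(1, 4), F(1, 2), F(1, 4)]]
n = len(P)

# Stationary vector alpha: n-1 balance equations plus the normalisation sum = 1.
A = [[P[i][j] - (1 if i == j else 0) for i in range(n)] for j in range(n - 1)]
A.append([F(1)] * n)
alpha = solve(A, [F(0)] * (n - 1) + [F(1)])

# Mean first passage times with M_jj = 0: for i != j, M_ij = 1 + sum_{k != j} P_ik M_kj.
mfp = [[F(0)] * n for _ in range(n)]
for j in range(n):
    others = [i for i in range(n) if i != j]
    B = [[(1 if r == c else 0) - P[r][c] for c in others] for r in others]
    for i, value in zip(others, solve(B, [F(1)] * len(others))):
        mfp[i][j] = value

# Row sums of {M_ij alpha_j}; the text asserts they do not depend on i.
row_sums = [sum(mfp[i][j] * alpha[j] for j in range(n)) for i in range(n)]
print(row_sums)
```

Exact rational arithmetic is used so that the equality of the row sums can be tested without a tolerance.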


4. Interpretation of results. Our results are more easily interpreted for ergodic
chains. Here the existence of $C_E^{-1}$ is equivalent to the existence of $M_E^{-1}$. Thus
we see that for ergodic chains this inverse exists for every finite set of states.
This is a generalization of the existence of $M^{-1}$ for finite chains, which we obtained
in [2].

Using the fact that $\hat l_i = \hat\lambda^E_i/\alpha_i = M_{iE}$, the mean time to return to set $E$ from $i$
(see Theorem 27), we see that the $i$th row-sum of $C_E^{-1}$ is $cM_{iE}$. We also note
that $c$ is the sum of all the components of $M_E^{-1}$, and hence $\hat\lambda^E$ is obtained by
normalizing the row-sums of $M_E^{-1}$.

It is also worth noting that for ergodic chains Theorem 2 is equivalent to the
assertion that the mean time from $i$ in $E$ to reach a state $j$ in $E$ is the mean
time to return to $E$ plus the mean time once $E$ is reached of hitting $j$.

To obtain an interpretation of one more result we will specialize to strong
ergodic chains.

THEOREM 7. For a strong ergodic chain,
$(\alpha M)_i + M_{ij} = (\alpha M)_j + \hat M_{ji}.$
PROOF. From the relation between $Z$ and $C$ we obtain (see Theorem 31),
$C_{ij} = (\alpha C)_j + (\alpha_j/\alpha_i)[\hat C_{ji} - (\alpha\hat C)_i].$

We then replace $C_{ij}$ by $M_{ij}\alpha_j$, and make use of the fact that $\alpha M$ is the same
for the reverse chain as for the original. (See Theorem 24 and Lemma 12.)

This result is interesting in itself. It says that "the time to reach $j$ in equilibrium
via $i$" is the same as "the time to reach $i$ in equilibrium via $j$" for the
reverse chain. But we can also use it to clarify a previous result, $C_E\hat l = (1/c)1$.


$(C_E\hat l)_i = \sum_{k\in E} C_{ik}\hat l_k = \sum_{k\in E} M_{ik}\hat\lambda^E_k = \sum_{k\in E}\hat\lambda^E_k(\alpha M)_k + \sum_{k\in E}\hat\lambda^E_k\hat M_{ki} - (\alpha M)_i.$

The last expression is entirely in terms of the reverse chain. For this chain the
last two terms yield $\sum_{k\in E}\lambda^E_k M_{ki} - (\alpha M)_i$. This has a simple probabilistic
interpretation. The second term is the time to reach $i$ in equilibrium, while in the
first term we start in equilibrium and count the time to reach $i$ after entering $E$.
Hence the difference is $-M_{\alpha E}$, the negative of the time to reach $E$ in equilibrium.
This explains why the sum is a constant, and we obtain that $1/\hat c =
\sum_{k\in E}\lambda^E_k(\alpha M)_k - M_{\alpha E}$. Since $(\alpha M)_k \ge M_{\alpha E}$, we see that $c$ is positive.

MARKOV CHAINS WITH ABSORBING STATES: A GENETIC EXAMPLE¹


By G. A. WATTERSON


Virginia Polytechnic Institute


1. Summary and introduction. If a finite Markov chain (discrete time, discrete 
states) has a number of absorbing states, one of these will eventually be reached. 
In this paper are given theoretical formulae for the probability distribution, its 
generating function and moments of the time taken to first reach an absorbing 
state, and these formulae are applied to an example taken from genetics. 

While first passage time problems and their solutions are known for a wide 
variety of Markov chain processes (e.g., [2], [7], [4]), the theory seems not to 
have been used in population genetics. Suppose a genetic population consists of a 
constant number of individuals and the state of the population is defined by 
the numbers of the various genotypes existing at a given time. Then if mutation 
is absent, all individuals will eventually become of the same genotype because 
of random influences such as births, deaths, mating, selection, chromosome 
breakages and recombinations. The population behavior may in some circum- 
stances be approximated by a Markov chain with absorbing states. 

In Section 2, two alternative approaches are given for the theoretical deter- 
mination of absorption time properties, using well known techniques. In Section 
3, the consequences of the theoretical results are investigated for a particular 
population model introduced by Moran [9], [10], and explicit expressions for 
the distribution of the gene fixation time are obtained in terms of Chebyshev’s 
orthogonal polynomials. The derivation requires finding the pre- and post- 
eigenvectors of the matrix of transition probabilities, and an incidental by- 
product is the proof of certain identities for the orthogonal polynomials. 

The material presented in Section 2 and Section 3 is obtained by exact methods. 
In Section 4, the Fokker-Planck diffusion equation is used to obtain approxi- 
mate results, and these are compared with those of the exact theory to ascertain 
the accuracy of the diffusion approximation. 


2. Markov chains with absorbing states.
(a) Arbitrary initial state. Consider a Markov chain with variable $x(t)$, which
at time $t$ ($t = 0, 1, 2, \ldots$) can be in any of the states $0, 1, 2, \ldots, M$. Let
(1)  $P_{ij} = \Pr\{x(\tau) = j \mid x(\tau - 1) = i\}$
be the unit-time transition probabilities, and write
(2)  $P^{(t)}_{ij} = \Pr\{x(t + \tau) = j \mid x(\tau) = i\}$,  $t, \tau = 0, 1, 2, \ldots$.

Then, if $P$ is the matrix of elements $P_{ij}$, the elements of $P^t$ are the $t$-step transition
probabilities (2).


Received September 7, 1960; revised February 13, 1961.
¹ The major portion of this work was completed while the author was a research scholar
at the Australian National University.

We assume that the states 0 and $M$ are absorbing, and that the states $1, 2,
\ldots, M - 1$ are transient. The following theory could be adapted to the case
with more (or fewer) absorbing states, but the application to genetics makes
the specific case important. Therefore we have

(3)  $P_{00} = P_{MM} = 1$,  $P_{01} = P_{02} = \cdots = P_{0M} = P_{M0} = P_{M1} = \cdots = P_{M,M-1} = 0$.

Let $T_i$ be the time taken for the chain to first reach one or other absorbing
state, given the initial state $x(0) = i$, and write $S^{(t)}_i$ for the probability that
$T_i = t$. Clearly

(4)  $S^{(t)}_i = P^{(t)}_{i0} + P^{(t)}_{iM} - P^{(t-1)}_{i0} - P^{(t-1)}_{iM}$,  $t = 1, 2, \ldots$
but in particular,
(5)  $S^{(0)}_0 = S^{(0)}_M = 1$,  $S^{(t)}_0 = S^{(t)}_M = 0$,  $t \ge 1$,
and for $i \ne 0$ or $M$,
(6)  $S^{(0)}_i = 0$,  $S^{(1)}_i = P_{i0} + P_{iM}$.
If we write $S^{(t)}$ as the column vector whose transpose is
$S^{(t)\prime} = (S^{(t)}_0, S^{(t)}_1, \ldots, S^{(t)}_M)$,

and, in particular, from (5) and (6),

(7)  $S^{(0)\prime} = (1, 0, 0, \ldots, 0, 1)$,
     $S^{(1)\prime} = (0, P_{10} + P_{1M}, P_{20} + P_{2M}, \ldots, P_{M-1,0} + P_{M-1,M}, 0)$,

then (4) becomes

(8)  $S^{(t)} = (P^t - P^{t-1})S^{(0)} = P^{t-1}S^{(1)}$,  $t = 1, 2, 3, \ldots$.
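The two forms of (8) can be compared numerically. The following is a minimal sketch, assuming a hypothetical 4-state chain ($M = 3$) with absorbing end states; the matrix entries and the function names are illustrative and do not come from the paper.

```python
from fractions import Fraction as F

# Hypothetical chain with M = 3; states 0 and 3 are absorbing (an assumption).
P = [
    [F(1), F(0), F(0), F(0)],
    [F(1, 3), F(1, 3), F(1, 3), F(0)],
    [F(0), F(1, 4), F(1, 4), F(1, 2)],
    [F(0), F(0), F(0), F(1)],
]
M = len(P) - 1

def mat_vec(A, v):
    """Matrix times column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# S^(0) and S^(1) written out as in (7).
S0 = [F(1)] + [F(0)] * (M - 1) + [F(1)]
S1 = [F(0)] + [P[i][0] + P[i][M] for i in range(1, M)] + [F(0)]

def first_absorption_via_differences(t_max):
    """S^(t) = (P^t - P^(t-1)) S^(0) for t = 1..t_max."""
    out, prev = [], S0
    for _ in range(t_max):
        cur = mat_vec(P, prev)
        out.append([c - p for c, p in zip(cur, prev)])
        prev = cur
    return out

def first_absorption_via_powers(t_max):
    """S^(t) = P^(t-1) S^(1) for t = 1..t_max."""
    out, cur = [], S1
    for _ in range(t_max):
        out.append(cur)
        cur = mat_vec(P, cur)
    return out

via_differences = first_absorption_via_differences(12)
via_powers = first_absorption_via_powers(12)
print(via_differences == via_powers)
```

Both routes avoid forming matrix powers explicitly; each step is one matrix-vector product, which is usually the cheaper choice when only the vector $S^{(t)}$ is wanted.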


The calculation of absorption probabilities by (8) will generally be a difficult
task unless the eigenvalues and eigenvectors of $P$ are known. If they are,
however, we can proceed as follows. Let $\lambda_j$ be the $j$th eigenvalue of $P$, and
$K_j$ the corresponding post-eigenvector. Then

$PK_j = K_j\lambda_j$,  $j = 0, 1, \ldots, M$;

that is $PK = K\Lambda$ where $K = (K_0, K_1, \ldots, K_M)$, $\Lambda = (\delta_{ij}\lambda_j)$. While the
columns of $K$ are the post-eigenvectors, the rows of $K^{-1}$ are the pre-eigenvectors,
and we have $P = K\Lambda K^{-1}$ or, more generally,

(9)  $P^{t-1} = K\Lambda^{t-1}K^{-1}$,  $t = 1, 2, \ldots$,

where $\Lambda^{t-1}$ is a diagonal matrix with elements $\lambda_j^{t-1}$ ($j = 0, 1, \ldots, M$) in the
diagonal. Thus substituting (9) into (8) gives

(10)  $S^{(t)} = K\Lambda^{t-1}K^{-1}S^{(1)}$.


At least in theory (10) gives the distributions we seek.

(b) Transient initial state. An alternative approach can be made by assuming
that $x(0) = i$ is not absorbing. Write $P_a$² for the matrix obtained by ignoring
the first and last rows and columns of $P$; $P_a$ is not stochastic since some non-zero
elements have been removed from the stochastic matrix $P$. Further, with
the notation

$S^{(t)\prime}_a = (S^{(t)}_1, S^{(t)}_2, \ldots, S^{(t)}_{M-1})$,

we find from (6) that

(11)  $S^{(0)\prime}_a = (0, 0, \ldots, 0)$,
      $S^{(1)\prime}_a = (P_{10} + P_{1M}, P_{20} + P_{2M}, \ldots, P_{M-1,0} + P_{M-1,M})$;

that is, $S^{(1)}_a = (I_a - P_a)1_a$, where $I_a$ is the unit matrix, $1_a' = (1, 1, 1, \ldots, 1)$,
both of order $M - 1$.
Corresponding to (8) we then have

(12)  $S^{(t)}_a = P_a^{t-1}S^{(1)}_a = P_a^{t-1}(I_a - P_a)1_a$.

From this, an equation analogous to (10) could be written down, but we will
not require it in the sequel.
For the probability generating function, we write

$G_a(z) = \sum_{t=0}^{\infty} z^tS^{(t)}_a = \sum_{t=1}^{\infty} z^tS^{(t)}_a$,

which by (12) is

$G_a(z) = z(I_a - P_a)1_a + z\sum_{t=2}^{\infty} z^{t-1}S^{(t)}_a$
$= z(I_a - P_a)1_a + z\sum_{t=2}^{\infty} (zP_a)^{t-1}(I_a - P_a)1_a$
$= z(I_a - P_a)1_a + z^2P_a(I_a - zP_a)^{-1}(I_a - P_a)1_a$
(13)  $= (z^{-1}I_a - P_a)^{-1}(I_a - P_a)1_a$.

Although (13) does not involve knowledge of the eigenvectors of $P$, the resolvent
$(z^{-1}I_a - P_a)^{-1}$ must be known for this to be a useful alternative. In the example
of Section 3, we shall meet a case where the resolvent is known at $z = 1$
and this is sufficient to calculate moments.
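As a numerical illustration of (12) and (13), the closed form of the generating function can be compared with a truncated power series. This is a sketch only: the 2 x 2 transient block `Pa` is hypothetical, the linear solver is a plain Gauss-Jordan routine written for the example, and the mean is taken as the derivative of (13) at $z = 1$, namely $(I_a - P_a)^{-1}1_a$.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

# Hypothetical transient block P_a: rows sum to less than 1 (an assumption).
Pa = [[0.5, 0.2],
      [0.3, 0.4]]
n = len(Pa)
S1 = [1.0 - sum(row) for row in Pa]          # (I_a - P_a) 1_a, as in (11)

def pgf_closed(z):
    """G_a(z) = (z^-1 I_a - P_a)^-1 (I_a - P_a) 1_a, equation (13)."""
    A = [[(1.0 / z if r == c else 0.0) - Pa[r][c] for c in range(n)] for r in range(n)]
    return solve(A, S1)

def pgf_series(z, terms=400):
    """Truncated sum of z^t S_a^(t), with S_a^(t) = P_a^(t-1) S_a^(1) from (12)."""
    total, s, zt = [0.0] * n, S1[:], z
    for _ in range(terms):
        total = [acc + zt * v for acc, v in zip(total, s)]
        s = [sum(Pa[r][c] * s[c] for c in range(n)) for r in range(n)]
        zt *= z
    return total

# Mean absorption time from each transient state: (I_a - P_a)^-1 1_a.
mean_time = solve([[(1.0 if r == c else 0.0) - Pa[r][c] for c in range(n)]
                   for r in range(n)], [1.0] * n)
print(pgf_closed(0.5), pgf_series(0.5), mean_time)
```

In this particular example both rows of `Pa` sum to 0.7, so the absorption time is geometric with parameter 0.3 from either state, which gives an independent check on the output.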

In [5], Karlin and McGregor have discussed the problem of random walks
with "ignored" absorbing states. The transition matrix for the transient states,
our $P_a$ above, is assumed to have the Jacobi form with

(14)  $P_{ij} = 0$ for $|i - j| > 1$.

Our genetics example in Section 3 is of this type, but does not appear to yield to
their methods.

² In what follows, the subscript $a$ will distinguish $(M-1)$-order matrices and vectors,
got by deleting the elements for states 0 and $M$, from the corresponding $(M+1)$-order
quantities.


3. Application to a genetic population model. 

(a) The model. We consider a population model in which there are M indi- 
viduals, each being one or other of two haploid genotypes. The birth-death 
model postulates that at each unit of time, one individual is chosen at random 
to die, and is replaced by a new individual whose genotype is determined at 
random from those existing before the death. Thus the number of individuals of a 
given genotype—the state of the population—can take any of the values 0, 1, 
2,---, M, and can change by at most unity during one birth-death event. This 
model was introduced by Moran [9], and further discussed by him, [10]. Actually, 
Moran considered as well the more general case when gene mutation was al- 
lowed. Here, as mutation is assumed absent, there is no source for new genes, 
and once all individuals are of the same genotype the population state remains 
unchanged thereafter. 

The transition probabilities are (see [9] with a new notation)

(15)  $P_{ij} = 0$, if $|i - j| > 1$,
      $P_{i,i-1} = i(M - i)M^{-2}$,
      $P_{ii} = 1 - 2i(M - i)M^{-2}$,
      $P_{i,i+1} = i(M - i)M^{-2}$.

The states 0 and $M$ are absorbing, those in-between are transient.
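The transition probabilities (15) are easy to tabulate. The sketch below builds the matrix for a small population and checks three properties that follow directly from (15): each row sums to 1, the end states are absorbing, and the expected change in state per step is zero. The function name `moran_matrix` and the choice $M = 6$ are illustrative assumptions.

```python
from fractions import Fraction as F

def moran_matrix(M):
    """(M+1) x (M+1) transition matrix of (15), states 0..M, in exact arithmetic."""
    P = [[F(0)] * (M + 1) for _ in range(M + 1)]
    for i in range(M + 1):
        step = F(i * (M - i), M * M)   # probability of a move to i-1, and also to i+1
        if i > 0:
            P[i][i - 1] = step
        if i < M:
            P[i][i + 1] = step
        P[i][i] = 1 - 2 * step
    return P

M = 6
P = moran_matrix(M)
row_sums = [sum(row) for row in P]
# Expected one-step change E[x(t+1) - x(t) | x(t) = i]; zero for every i.
drift = [sum(j * P[i][j] for j in range(M + 1)) - i for i in range(M + 1)]
print(row_sums, drift)
```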
(b) Known results. Hannan, in an appendix to [9], has proved the following
theorem, expressed here in our notation.
THEOREM 1. Transforming the matrix $P$ of (15) by the matrix $R$, where $R$ has
the typical element $R_{ij} = \binom{i}{j}$ and $R^{-1}$ has the typical element $(-1)^{i+j}\binom{i}{j}$, $i, j =
0, 1, \ldots, M$, then $R^{-1}PR$ has non-zero terms only in the leading and first super
diagonals. The $i$th row is

(16)  $(0, 0, \ldots, 0, 1 - i(i - 1)M^{-2}, -i(M - i)M^{-2}, 0, 0, \ldots, 0)$,

the quantity $1 - i(i - 1)M^{-2}$ in the diagonal position is the $i$th eigenvalue of $P$.
Moran [10] stated the following results, again expressed here in our notation.
THEOREM 2. If $K = (K_0, K_1, \ldots, K_M)$ is a matrix of eigenvectors $K_j' =
(K_{0j}, K_{1j}, \ldots, K_{Mj})$ such that

$K^{-1}PK = \Lambda = (\delta_{ij}(1 - i(i - 1)M^{-2}))$,

then, apart from normalizing constants,
(i) $K_{i0} = 1$, $K_{i1} = i$, $K_{i2} = i(M - i)$, $K_{i3} = i(M - i)(M - 2i)$,
$K_{i4} = i(M - i)(M^2 - 5Mi + 5i^2 + 1)$, $i = 0, 1, \ldots, M$.
(ii) The $j$th element of the $i$th prevector, $K^{ij}$ say, is proportional to
$j^{-1}(M - j)^{-1}K_{ji}$,  $j = 1, 2, \ldots, M - 1$,  $i = 0, 1, \ldots, M$.
(iii) The generating function $\sum_i K_{ij}z^i$ satisfies a particular case of Heun's
differential equation.
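The five vectors listed in Theorem 2(i) can be checked directly against the matrix (15). The sketch below does this in exact arithmetic for one population size; the helper names and the choice $M = 7$ are assumptions made for the illustration, and the check says nothing about normalising constants.

```python
from fractions import Fraction as F

def moran_matrix(M):
    """Transition matrix of (15) on states 0..M."""
    P = [[F(0)] * (M + 1) for _ in range(M + 1)]
    for i in range(M + 1):
        step = F(i * (M - i), M * M)
        if i > 0:
            P[i][i - 1] = step
        if i < M:
            P[i][i + 1] = step
        P[i][i] = 1 - 2 * step
    return P

def theorem2_vectors(M):
    """Un-normalised columns K_0..K_4 of Theorem 2(i), each indexed by i = 0..M."""
    rng = range(M + 1)
    return [
        [F(1) for i in rng],
        [F(i) for i in rng],
        [F(i * (M - i)) for i in rng],
        [F(i * (M - i) * (M - 2 * i)) for i in rng],
        [F(i * (M - i) * (M * M - 5 * M * i + 5 * i * i + 1)) for i in rng],
    ]

def are_eigenvectors(M):
    """For j = 0..4, test P K_j == lambda_j K_j with lambda_j = 1 - j(j-1)/M^2."""
    P = moran_matrix(M)
    results = []
    for j, K in enumerate(theorem2_vectors(M)):
        lam = 1 - F(j * (j - 1), M * M)
        PK = [sum(P[i][n] * K[n] for n in range(M + 1)) for i in range(M + 1)]
        results.append(PK == [lam * k for k in K])
    return results

print(are_eigenvectors(7))
```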

(c) New results. In Theorem 2, Moran has given some of the early eigenvectors
of the matrix $P$. Theorem 1, however, can be used to obtain an explicit expression
for all the post-eigenvectors. Suppose that $W$ is a matrix such that

$R^{-1}PRW = W\Lambda$.

Then $RW$ has columns which are the required post-eigenvectors. By using (16)
and the known eigenvalues, it is seen that the elements of $W$ must satisfy the
difference equation

$i(M - i)W_{i+1,j} = [j(j - 1) - i(i - 1)]W_{ij}$.

One solution is

$W_{00} = 1$,  $W_{i0} = 0$,  $i = 1, 2, \ldots, M$,

and for $j \ge 1$,

$W_{0j} = 0$,  $W_{1j} = 1$,  $W_{ij} = \prod_{k=1}^{i-1}[j(j - 1) - k(k - 1)]k^{-1}(M - k)^{-1}$,  $i = 2, 3, \ldots, M$.

Hence we have

THEOREM 3. The post-eigenvectors $K_j$ of $P$ have elements $K_{ij}$ proportional to

$\sum_{k=0}^{M} R_{ik}W_{kj} = \sum_{k=0}^{i}\binom{i}{k}W_{kj}$.

This result would be sufficient to obtain explicit expressions for the probability
of first absorption at time $t$. However, the resulting expressions are rather
complicated; luckily a different approach leads to tractable algebra. Consider the
Chebyshev orthogonal polynomials defined by

(17)  $\xi_j(x) = c_j\Delta^j[\binom{x}{j}\binom{x - M}{j}]$,  $j = 0, 1, \ldots, M - 1$,

where

$\Delta f(x) = f(x + 1) - f(x)$,
$\Delta^jf(x) = \sum_{k=0}^{j}(-1)^k\binom{j}{k}f(x + j - k)$,
$c_j = j!(2j + 1)^{1/2}[M(M^2 - 1^2)(M^2 - 2^2)\cdots(M^2 - j^2)]^{-1/2}$.

Then $\xi_j(x)$ is a polynomial of degree $j$ in $x$, and the set is orthogonal in the sense
that

(18)  $\sum_{k=0}^{M-1}\xi_i(k)\xi_j(k) = \delta_{ij}$,  $i, j = 0, 1, 2, \ldots, M - 1$,

see [1], p. 223. In what follows it will be convenient to use the conventions

(19)  $\xi_j(-1) = 0$,  $j = 0, 1, \ldots, M - 1$,

and to introduce the new function

(20)  $\xi_{-1}(k) = 0$ if $k = 0, 1, \ldots, M - 1$;  $\xi_{-1}(k) = 1$ if $k = -1$.

From (18), (19) and (20), it is clear that the augmented set $\{\xi_j(k)\}$, $j = -1,
0, 1, \ldots, M - 1$, is orthogonal over $k = -1, 0, 1, \ldots, M - 1$.

The non-trivial values of $\xi_j(k)$ have been tabulated in [11] for $M = 3(1)52$,
$j = 1(1)6$, and references are given there to more extensive tabulations; for
each $j$, the values of $\xi_j(k)$ are multiplied by the smallest constant to make the
tabulated entries integers.

The functions $\xi_j(x)$ satisfy the difference equation

$(x + 2)(x - M + 2)\Delta^2\xi_j(x) + [2x - M + 3 - j(j + 1)]\Delta\xi_j(x) - j(j + 1)\xi_j(x) = 0$,

see [1], p. 223, and this may be written

$j(j + 1)\xi_j(x + 1) = \Delta[(x + 1)(x - M + 1)\Delta\xi_j(x)]$.

Summing over the integers, and renaming the variables, we get

(21)  $j(j - 1)\sum_{k=0}^{i}\xi_{j-1}(k - 1) = i(M - i)[\xi_{j-1}(i - 1) - \xi_{j-1}(i)]$,  $i, j = 0, 1, \ldots, M$,

where the conventions (19) and (20) have been used.


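We are now in a position to prove
THEOREM 4. For the matrix $P$ defined in (15),
(i) The eigenvalues are
(22)  $\lambda_j = 1 - j(j - 1)M^{-2}$,  $j = 0, 1, \ldots, M$.
(ii) The post-eigenvectors are the columns $K_j$ of the matrix
(23)  $K = (K_0, K_1, \ldots, K_M) = C\Xi$,

where $C$ is the lower triangular matrix of order $M + 1$ with every element on and
below the diagonal equal to 1, and $\Xi$ has $\xi_{j-1}(i - 1)$ in the $(i, j)$ position, $i, j = 0, 1, \ldots, M$.
(iii) The pre-eigenvectors are the rows of the matrix

(24)  $K^{-1} = \Xi'C^{-1}$,

where $\Xi'$ is the transpose of $\Xi$, and $C^{-1}$ has 1 in each diagonal position, $-1$ in each
position of the first sub-diagonal, and 0 elsewhere.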

PROOF: Parts (ii) and (iii) of the theorem are either true or false together,
because $\Xi$ is an orthogonal matrix, $C^{-1}$ has the stated form, and the inverse of
the post-eigenvector matrix gives the pre-vectors. It will therefore be sufficient
to prove that (i) and (ii) are correct, and this is done by proving

$PK = K\Lambda$

for the particular definitions used here. Write $g_{ij}$ and $h_{ij}$ for the typical elements
of the left- and right-hand sides respectively; then we have to show that $g_{ij} =
h_{ij}$ for $i, j = 0, 1, \ldots, M$.
Multiplying out $PK = PC\Xi$ we find

$g_{ij} = \sum_{k=0}^{M}\sum_{n=k}^{M}P_{in}\xi_{j-1}(k - 1)$,

and with the substitution for $P_{in}$ from (15) we get

$g_{ij} = \sum_{k=0}^{i-1}\xi_{j-1}(k - 1) + [1 - i(M - i)M^{-2}]\xi_{j-1}(i - 1) + i(M - i)M^{-2}\xi_{j-1}(i)$.

Again, multiplying out $K\Lambda = C\Xi\Lambda$ we have

$h_{ij} = \lambda_j\sum_{k=0}^{i}\xi_{j-1}(k - 1) = [1 - j(j - 1)M^{-2}]\sum_{k=0}^{i}\xi_{j-1}(k - 1)$.

The equality of $g_{ij}$ and $h_{ij}$ follows from (21), and holds for all relevant $i, j$.
This completes the proof of the theorem.
COROLLARY.

(25)  $K_{ij} = i(M - i)(j - 1)^{-1}j^{-1}[\xi_{j-1}(i - 1) - \xi_{j-1}(i)]$.

This follows immediately from (23), which gives $K_{ij} = \sum_{k=0}^{i}\xi_{j-1}(k - 1)$,
and (21).

The theorem is not completely new. It restates the eigenvalues given already
in Theorem 1. The first five eigenvectors in (23) agree, except for multiplicative
constants, with those of Theorem 2(i). Also the relationship between pre- and
post-eigenvectors is found to be $K^{ij} = (i - 1)ij^{-1}(M - j)^{-1}K_{ji}$, using (24)
and (25), and this verifies Theorem 2(ii).


With these preliminaries, we take up the problem of absorption time for the 
population. 


THEOREM 5. The probability of first absorption at time $t$, given an initial state $i$, is

(26)  $S^{(t)}_i = 2i(M - i)M^{-2}\sum_{j=1}^{[\frac12M]}\{[1 - 2j(2j - 1)M^{-2}]^{t-1}\xi_{2j-1}(0)[\xi_{2j-1}(i - 1) - \xi_{2j-1}(i)]\}$,
      $t = 1, 2, 3, \ldots$;  $i = 0, 1, 2, \ldots, M$,

where $[\frac12M]$ is the integral part of $\frac12M$.
PROOF. From (10), (23), (24) we have
$S^{(t)} = K\Lambda^{t-1}K^{-1}S^{(1)} = C\Xi\Lambda^{t-1}\Xi'C^{-1}S^{(1)}$,  $t = 1, 2, 3, \ldots$,
where, by (7), (15),
$S^{(1)\prime} = (0, (M - 1)M^{-2}, 0, 0, \ldots, 0, (M - 1)M^{-2}, 0)$.

Multiplying out the matrices involved, we have, for the $i$th element,

(27)  $S^{(t)}_i = (M - 1)M^{-2}\sum_{j=2}^{M}\{[1 - j(j - 1)M^{-2}]^{t-1}[\xi_{j-1}(0) - \xi_{j-1}(1) + \xi_{j-1}(M - 2) - \xi_{j-1}(M - 1)]$
      $\cdot\, i(M - i)(j - 1)^{-1}j^{-1}[\xi_{j-1}(i - 1) - \xi_{j-1}(i)]\}$,
      $t = 1, 2, 3, \ldots$;  $i = 0, 1, 2, \ldots, M$.

Because

(28)  $\xi_{j-1}(0) = \pm\xi_{j-1}(M - 1)$,  $\xi_{j-1}(1) = \pm\xi_{j-1}(M - 2)$,

depending on whether $j$ is odd or even, only terms with $j$ even need be included
in the summation. Hence we replace $j$ by $2j$, and add up to the term with $j =
[\frac12M]$, the integral part of $\frac12M$. From (21), (28), we have that

(29)  $(M - 1)M^{-2}[\xi_{2j-1}(0) - \xi_{2j-1}(1)] = (M - 1)M^{-2}[\xi_{2j-1}(M - 2) - \xi_{2j-1}(M - 1)] = 2j(2j - 1)M^{-2}\xi_{2j-1}(0)$.

Substituting for (28), (29), into (27) gives the expression (26) of the theorem.
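Formula (26) can be compared with a direct iteration of (8) for a small population. The sketch below is a hedged illustration: it evaluates the orthonormal polynomials from the definition (17) in floating point, uses a generalised binomial coefficient so that negative arguments are allowed, and relies on function names (`gbinom`, `xi`, `formula_26`, `direct_8`) that are assumptions of this example rather than notation from the paper.

```python
from fractions import Fraction as F
from math import comb, factorial, sqrt

def gbinom(n, k):
    """Generalised binomial n(n-1)...(n-k+1)/k!, valid for negative integer n."""
    num = 1
    for r in range(k):
        num *= n - r
    return F(num, factorial(k))

def xi(j, x, M):
    """Orthonormal polynomial of degree j on 0..M-1, evaluated from (17)."""
    f = lambda y: gbinom(y, j) * gbinom(y - M, j)
    diff = sum((-1) ** k * comb(j, k) * f(x + j - k) for k in range(j + 1))
    norm2 = M
    for r in range(1, j + 1):
        norm2 *= M * M - r * r
    return factorial(j) * sqrt((2 * j + 1) / norm2) * float(diff)

def formula_26(i, t, M):
    """First absorption probability at time t from state i, by (26)."""
    total = 0.0
    for j in range(1, M // 2 + 1):
        n = 2 * j - 1
        lam = 1 - 2 * j * (2 * j - 1) / M**2
        total += lam ** (t - 1) * xi(n, 0, M) * (xi(n, i - 1, M) - xi(n, i, M))
    return 2 * i * (M - i) / M**2 * total

def direct_8(i, t, M):
    """The same probability from S^(t) = P^(t-1) S^(1), with P as in (15)."""
    step = [k * (M - k) / M**2 for k in range(M + 1)]
    s = [0.0] * (M + 1)
    s[1] += step[1]            # P_10, absorption from state 1
    s[M - 1] += step[M - 1]    # P_{M-1,M}, absorption from state M-1
    for _ in range(t - 1):
        s = [s[k] if k in (0, M) else
             s[k] + step[k] * (s[k - 1] - 2 * s[k] + s[k + 1])
             for k in range(M + 1)]
    return s[i]

M = 5
worst = max(abs(formula_26(i, t, M) - direct_8(i, t, M))
            for i in range(M + 1) for t in range(1, 7))
print(worst)
```

For larger $M$ the tabulated integer values mentioned in the text, or a three-term recurrence, would be numerically safer than evaluating high-order differences directly; the direct evaluation is kept here only because it mirrors (17).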

While (26) seems to be the simplest form for the absorption probability at
general time $t$, for $t$ reasonably small, a direct evaluation of (10) could be used.
For example, with $t = 1$ we know

$S^{(1)}_i = (\delta_{i1} + \delta_{i,M-1})(M - 1)M^{-2}$,

and hence we have the
COROLLARY.

$(\delta_{i1} + \delta_{i,M-1})(M - 1)M^{-2} = 2i(M - i)M^{-2}\sum_{j=1}^{[\frac12M]}\{\xi_{2j-1}(0)[\xi_{2j-1}(i - 1) - \xi_{2j-1}(i)]\}$,  $i = 0, 1, \ldots, M$.

This, and other identities for the orthogonal polynomials can be obtained from
$C\Xi\Lambda^t\Xi'C^{-1} = P^t$,  $t = 0, 1, 2, \ldots$.
We note that (26) agrees with (5), in so much as the right hand side is zero when
$i = 0$ or $i = M$, for all $t \ge 1$.


THEOREM 6. The probability generating function for the first absorption time
distribution is

(30)  $G_i(z) = \delta_{i0} + \delta_{iM} + 2zi(M - i)M^{-2}\sum_{j=1}^{[\frac12M]}\{[1 - z + 2zj(2j - 1)M^{-2}]^{-1}\xi_{2j-1}(0)[\xi_{2j-1}(i - 1) - \xi_{2j-1}(i)]\}$.

PROOF. By definition,

$G_i(z) = \sum_{t=0}^{\infty}S^{(t)}_iz^t = \delta_{i0} + \delta_{iM} + \sum_{t=1}^{\infty}S^{(t)}_iz^t$.

Substituting for the $S^{(t)}_i$ from (26), and summing the geometric series involved,
gives (30).
Because $G_i(z)$ is a probability generating function, we must have $G_i(1) = 1$
for all $i$, and hence
COROLLARY.

$2i(M - i)M^{-2}\sum_{j=1}^{[\frac12M]}\{[2j(2j - 1)M^{-2}]^{-1}\xi_{2j-1}(0)[\xi_{2j-1}(i - 1) - \xi_{2j-1}(i)]\} = 1 - \delta_{i0} - \delta_{iM}$,  $i = 0, 1, \ldots, M$.

The generating function in Theorem 6 must be consistent with (13), although
this is not obvious. One can obtain all the moments of the absorption time $T_i$ by
suitable differentiation of $G_i(z)$. Thus
THEOREM 7.

(31)  $E(T_i) = (d/dz)G_i(z)|_{z=1} = 2M^2i(M - i)\sum_{j=1}^{[\frac12M]}\{[2j(2j - 1)]^{-2}\xi_{2j-1}(0)[\xi_{2j-1}(i - 1) - \xi_{2j-1}(i)]\}$,

(32)  $\mathrm{Var}(T_i) = 4M^4i(M - i)\sum_{j=1}^{[\frac12M]}\{[2j(2j - 1)]^{-3}\xi_{2j-1}(0)[\xi_{2j-1}(i - 1) - \xi_{2j-1}(i)]\} - E(T_i) - (E(T_i))^2$.

PROOF. The expression (31) for the expected value of $T_i$ is immediate from
(30). For the variance, we have

$\mathrm{Var}(T_i) = E[T_i(T_i - 1)] + E(T_i) - (E(T_i))^2$
$= (d^2/dz^2)G_i(z)|_{z=1} + E(T_i) - (E(T_i))^2$.

But

$(d^2/dz^2)G_i(z)|_{z=1} = 4M^4i(M - i)\sum_{j=1}^{[\frac12M]}\{[2j(2j - 1)]^{-3}[1 - 2j(2j - 1)M^{-2}]\xi_{2j-1}(0)[\xi_{2j-1}(i - 1) - \xi_{2j-1}(i)]\}$
$= 4M^4i(M - i)\sum_{j=1}^{[\frac12M]}\{[2j(2j - 1)]^{-3}\xi_{2j-1}(0)[\xi_{2j-1}(i - 1) - \xi_{2j-1}(i)]\} - 2E(T_i)$,

by (31). Hence, we obtain (32).

While the above discussion is sufficient to solve all problems of interest, the
expressions obtained are not simple to use in practice, even assuming that $M$ is
sufficiently small for the values of $\xi_j(k)$ to be available in tables. We shall now
show how the moments of $T_i$ can be obtained in terms of elementary functions
by using the approach of Section 2(b). Here we assume that the initial state $i$ is not
absorbing and consider the truncated matrix $P_a$.

Differentiating (13) with respect to $z$, and evaluating the result at $z = 1$ gives

(33)  $(d/dz)G_a(z)|_{z=1} = [z^{-2}(z^{-1}I_a - P_a)^{-2}]_{z=1}(I_a - P_a)1_a = (I_a - P_a)^{-1}1_a$.

This equation was given in [2], p. 378, ex. 17, and in [6], p. 51, with different
notations and derivations. In [6], $(I_a - P_a)^{-1}$ was called the "fundamental
matrix." It was used in [6], p. 177 in the genetics problem of a family tree with
non-random mating, whereas here we are concerned with the entire population.

Higher moments can be obtained similarly. For example, for the second factorial
moment we have

(34)  $(d^2/dz^2)G_a(z)|_{z=1} = [-2z^{-3}(z^{-1}I_a - P_a)^{-2} + 2z^{-4}(z^{-1}I_a - P_a)^{-3}]_{z=1}(I_a - P_a)1_a$
      $= -2(I_a - P_a)^{-1}1_a + 2(I_a - P_a)^{-2}1_a$
      $= 2[(I_a - P_a)^{-1} - I_a](d/dz)G_a(z)|_{z=1}$.

This formula was given in [6], p. 51, and a genetics example worked in [6], p. 177.
For the particular population model (15), we have

THEOREM 8. The first two moments of the absorption time, given an initial state $i$,
are

(35)  $E(T_i) = (M - i)\sum_{j=1}^{i}(1 - jM^{-1})^{-1} + i\sum_{j=1}^{M-i-1}(1 - jM^{-1})^{-1}$,

(36)  $\mathrm{Var}(T_i) = \{2M(M - i)\sum_{k=1}^{i} + 2Mi\sum_{k=1}^{M-i-1}\}\{\sum_{j=1}^{k}(1 - jM^{-1})^{-1} + k(M - k)^{-1}\sum_{j=1}^{M-k-1}(1 - jM^{-1})^{-1}\} - E(T_i) - (E(T_i))^2$,

where the terms in braces are to be multiplied symbolically to give four double sums.


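PROOF. Omitting the first and last rows and columns of $P$ defined in (15),
we see that $(I_a - P_a)$ can be written as the product

$I_a - P_a = M^{-2}\,\mathrm{diag}\{1(M - 1),\ 2(M - 2),\ \ldots,\ (M - 1)1\}\,A$,

where $A$ is the tridiagonal matrix of order $M - 1$ with 2 in each diagonal position and
$-1$ in each position of the two adjacent diagonals. The inverse is therefore the product
of two symmetric matrices

$(I_a - P_a)^{-1} = M\,B\,\mathrm{diag}\{[1(M - 1)]^{-1},\ [2(M - 2)]^{-1},\ \ldots,\ [(M - 1)1]^{-1}\}$,

where $B$ has $(i, k)$ element $\min(i, k)[M - \max(i, k)]$; its first row is
$(M - 1, M - 2, M - 3, \ldots, 2, 1)$ and its second row is $(M - 2, 2(M - 2), 2(M - 3), \ldots, 2)$.

Substituting into (33) gives (35) for the $i$th element, and into (34) gives the $i$th
element

$(d^2/dz^2)G_a(z)|_{z=1} = \{2M(M - i)\sum_{k=1}^{i} + 2Mi\sum_{k=1}^{M-i-1}\}\{\sum_{j=1}^{k}(1 - jM^{-1})^{-1} + k(M - k)^{-1}\sum_{j=1}^{M-k-1}(1 - jM^{-1})^{-1}\} - 2E(T_i)$,

from which (36) follows.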

COROLLARY. Equating (35) and (36) with (31) and (32), respectively, results
in two identities for the orthogonal polynomials.

If M is small, (35) and (36) appear to be preferable to (31) and (32), but 
in actual populations M is large and approximate procedures are required. 
These are discussed below. 
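For small $M$ the closed forms of Theorem 8 can be compared with the matrix expressions (33) and (34). The sketch below does so in exact arithmetic; it is an illustration under assumptions (the helper names, the plain Gauss-Jordan solver and the choice $M = 7$ are not from the paper), and the double sums of (36) are coded in the symbolic-product reading given in the theorem.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, non-singular)."""
    n = len(A)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def moments_by_matrix(M):
    """Mean and variance of T_i for i = 1..M-1 from (33) and (34)."""
    states = range(1, M)
    step = [F(i * (M - i), M * M) for i in range(M + 1)]
    # I_a - P_a for the model (15): 2*step on the diagonal, -step beside it.
    A = [[(2 * step[i] if i == k else F(0)) - (step[i] if abs(i - k) == 1 else F(0))
          for k in states] for i in states]
    mean = solve(A, [F(1)] * (M - 1))
    second_factorial = [2 * (a - m) for a, m in zip(solve(A, mean), mean)]
    var = [f2 + m - m * m for f2, m in zip(second_factorial, mean)]
    return mean, var

def h(k, M):
    """sum_{j=1}^{k} (1 - j/M)^(-1)."""
    return sum(F(M, M - j) for j in range(1, k + 1))

def mean_formula(i, M):
    """Equation (35)."""
    return (M - i) * h(i, M) + i * h(M - i - 1, M)

def var_formula(i, M):
    """Equation (36), with the braces multiplied out as two outer sums."""
    inner = lambda k: h(k, M) + F(k, M - k) * h(M - k - 1, M)
    outer = (2 * M * (M - i) * sum(inner(k) for k in range(1, i + 1))
             + 2 * M * i * sum(inner(k) for k in range(1, M - i)))
    m = mean_formula(i, M)
    return outer - m - m * m

M = 7
mean, var = moments_by_matrix(M)
print([float(m) for m in mean])
print([float(v) for v in var])
```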


4. Approximations and the diffusion equation. When the population size $M$ is
large, the Markov chain can be approximated by a diffusion process continuous
in space and time. We make the time scale transformation $u = M^{-2}t$ and
the state transformation $y(u) = M^{-1}x(M^2u)$. Since $x = 0, 1, 2, \ldots, M$, we
have $y = 0, M^{-1}, 2M^{-1}, \ldots, 1$ and letting $M \to \infty$ but keeping $u$ fixed, it can
be shown that the distribution of $y(u)$ approaches a distribution function which
has jumps at $y = 0$ and $y = 1$ but is differentiable in the open interval $(0, 1)$,
see [12], [13] for similar examples. In other words, the discrete variable $y(u)$
has an approximately continuous distribution within $(0, 1)$ for sufficiently large
$M$. Write the derivative of this distribution as $f(y, u)$; then it may be shown that

$\partial f(y, u)/\partial u = \partial^2[y(1 - y)f(y, u)]/\partial y^2$  for $0 < y < 1$

and apart from the accumulations of probability at $y = 0$ and $y = 1$, $f(y, u)$
behaves as an approximate density for $y(u)$. This equation is a special case of
the "Fokker-Planck diffusion equation," and requires for its unique solution a
specification of the initial function $f(y, 0)$.

Further, following [8], or [12], the probability that the diffusing state is absorbed
at $y = 1$ at or before time $u$ is given by the solution of the backward equation

$\partial G(p, u)/\partial u = p(1 - p)\,\partial^2G(p, u)/\partial p^2$,

where $p = y(0) = M^{-1}x(0) = iM^{-1}$, and the boundary conditions $G(0, u) = 0$,
$G(1, u) = 1$, must be satisfied for all $u > 0$. The solution of this equation is
(see [8], eqn (5.3) with a different notation)

(37)  $G(p, u) = p + \sum_{j=1}^{\infty}(2j + 1)p(1 - p)(-1)^jF(1 - j, j + 2, 2, p)\,e^{-j(j+1)u}$.


From (37) we see that the probability of ultimate absorption at $y = 1$ ($x = M$)
is $\lim_{u\to\infty}G(p, u) = p$. This result happens to be exactly correct for the
discrete Markov process, for the ultimate value of $x(t)$ can only be 0 or $M$,
and it is easily checked from (15) that $E(x(t))$ remains constant throughout
time, and therefore $E(x(\infty)) = i = pM$. This is therefore one aspect of the
model's behaviour that the diffusion approximation predicts exactly.
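The statement that the probability of ultimate absorption at $x = M$ equals $i/M$ can be checked exactly for the discrete chain by solving the usual first-step equations on the transient states. The sketch below is an illustration; the function names and the choice $M = 8$ are assumptions of the example.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, non-singular)."""
    n = len(A)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def fixation_probabilities(M):
    """Probability of ending in state M from each transient state 1..M-1."""
    step = [F(i * (M - i), M * M) for i in range(M + 1)]
    states = range(1, M)
    # First-step equations (I_a - P_a) b = r, where r_i = P_{iM} from (15).
    A = [[(2 * step[i] if i == k else F(0)) - (step[i] if abs(i - k) == 1 else F(0))
          for k in states] for i in states]
    r = [step[i] if i == M - 1 else F(0) for i in states]
    return solve(A, r)

M = 8
probs = fixation_probabilities(M)
print(probs)
```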

Also, from (37), we can find the probability of absorption in either state at
or before $u$ by symmetry. It is

(38)  $G(p, u) + G(1 - p, u) = 1 + \sum_{j=1}^{\infty}(2j + 1)p(1 - p)(-1)^j[F(1 - j, j + 2, 2, p) + F(1 - j, j + 2, 2, 1 - p)]\,e^{-j(j+1)u}$

and hence the probability of absorption at exactly time $t$ (on the old scale) is
approximately

(39)  $S^{(t)}_i = G(p, M^{-2}t) + G(1 - p, M^{-2}t) - G(p, M^{-2}(t - 1)) - G(1 - p, M^{-2}(t - 1))$
      $= \sum_{j=1}^{\infty}(2j + 1)p(1 - p)(-1)^j[F(1 - j, j + 2, 2, p) + F(1 - j, j + 2, 2, 1 - p)]\,e^{-j(j+1)M^{-2}t}[1 - e^{j(j+1)M^{-2}}]$,

where $p = iM^{-1}$. An exact formula for $S^{(t)}_i$ was given in (26) but comparisons
for the accuracy of (39) as an approximation seem hopeless. In any case (39)
seems no easier to compute than (26).

The moments of the first absorption time $T_i$ can be calculated with considerable
difficulty from the approximate distribution (39), but for the mean a simpler
procedure is the following. Write $U(p)$ as the expected value of the time,
measured on the $u$-scale, for one or other absorbing state to be first reached.
Then Feller [3] states that $U(p)$ is the solution of an ordinary differential equation
which reduces to

$p(1 - p)[d^2U(p)/dp^2] = -1$

in our case, with the boundary conditions $U(0) = U(1) = 0$. The solution is

$U(p) = \log[p^{-p}(1 - p)^{-(1-p)}]$,

or measured on the $t$ scale and with $p = iM^{-1}$,

(40)  $E(T_i) = M^2\log[p^{-p}(1 - p)^{-(1-p)}]$.


Comparing this with the exact result found in (35), we see that the diffusion 
approximation is equivalent to replacing the Riemann summation in (35) by 
integration, and hence the approximation should be good for all but very small 
values of M. 
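
The mean formula can be checked numerically. The following sketch assumes the reading $U(p) = \log[p^{-p}(1-p)^{-(1-p)}]$ of the solution above and verifies by central differences that it satisfies $p(1-p)U''(p) = -1$; the function names and the step size `h` are illustrative choices, not part of the original text.

```python
import math


def mean_time_u_scale(p):
    """U(p) = -[p log p + (1-p) log(1-p)], with U(0) = U(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def ode_residual(p, h=1e-4):
    """Residual of p(1-p) U''(p) + 1, using a central second difference."""
    second = (mean_time_u_scale(p + h) - 2.0 * mean_time_u_scale(p)
              + mean_time_u_scale(p - h)) / (h * h)
    return p * (1.0 - p) * second + 1.0


def expected_absorption_time(i, M):
    """Approximation (40): E(T_i) = M * U(i / M) on the original t scale."""
    return M * mean_time_u_scale(i / M)


if __name__ == "__main__":
    print(expected_absorption_time(50, 100))  # M log 2 when p = 1/2
```

For p = 1/2 the approximation reduces to M log 2, so the mean absorption time grows linearly in M.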


Writing V(p) as the second moment of the distribution (38), that is

$$V(p) = \int_0^{\infty} u^2\, d\bigl[G(p,u) + G(1-p,u)\bigr],$$

one can show that V(p) is the solution of the equation

$$p(1-p)\,\frac{d^2V(p)}{dp^2} = -2U(p),$$


where U(p) is as above, and V(0) = V(1) = 0. The solution is 


©. nt! a +1 
V(p) = am et = 2 Pp =? vr ee 2 log [p (1 ide p)“ »). 


Making the mean correction and reverting to the t-time scale, we have for the variance of the first absorption time,

$$\operatorname{Var}(T_i) = M^2 V(p) - [E(T_i)]^2, \qquad p = iM^{-1}.$$

It may be verified that this approximation is obtained if, in the exact formula (36), the summations are replaced by integrations, and terms of order less than $M^2$ are ignored. Thus for both the mean and the variance, the diffusion approximation should be adequate for all but very small M.

Other aspects of the model’s behaviour have been given in [9], [10]. 


Acknowledgment. The author thanks Mr. J. E. Moyal for suggesting the use 
of the eigenvector expansion (9), and Professor E. J. Hannan for discussion of his 
derivation of the eigenvalues leading to Theorem 1. 
ESTIMATION OF THE SPECTRUM¹
By V. K. Murthy
Stanford University and University of North Carolina


0. Summary. This paper extends some results of Grenander [1] relating to dis- 
crete real stationary normal processes with absolutely continuous spectrum to 
the case in which the spectrum also contains a step function with a finite number 
of saltuses. 

It is shown by Grenander [1] that the periodogram is an asymptotically unbiased estimate of the spectral density f(λ) and that its variance is [f(λ)]² or 2[f(λ)]², according as λ ≠ 0 or λ = 0. In the present paper the same results are established at a point of continuity.

The consistency of a suitably weighted periodogram for estimating f(λ) is established by Grenander [1]. In this paper a weighted periodogram estimate similar to that of Grenander (except that the weight function is more restricted) is constructed which consistently estimates the spectral density at a point of continuity.

It appears that this extended result leads to a direct approach to the location 
of a single periodicity irrespective of the presence of others in the time series. 


1. Introduction and preliminary lemmas. We shall now proceed to establish our results.

Let x(n) be a discrete, real, stationary, normal process. It is known (Karhunen [2]) that the process can be decomposed into two mutually orthogonal stationary processes as $x(n) = x_1(n) + x_2(n)$, where $x_1(n)$ is a purely periodic process and $x_2(n)$ is a purely non-periodic process.

Let [x(−N), x(−N + 1), ..., x(−1), x(0), x(1), ..., x(N − 1), x(N)] be a realization of size 2N + 1 from the process x(n), and consider the statistic proposed by Grenander [1],

$$I_N(\lambda) = \frac{1}{2\pi(2N+1)} \left| \sum_{\nu=-N}^{N} x(\nu)\,e^{-i\nu\lambda} \right|^2. \tag{1.1}$$

This is the usual periodogram based on the realization. We have

$$I_N(\lambda) = \frac{1}{2\pi(2N+1)} \left| \sum_{\nu=-N}^{N} x_1(\nu)e^{-i\nu\lambda} \right|^2 + \frac{1}{2\pi(2N+1)} \left| \sum_{\nu=-N}^{N} x_2(\nu)e^{-i\nu\lambda} \right|^2$$

$$\quad + \frac{1}{2\pi(2N+1)} \left( \sum_{\nu=-N}^{N} x_1(\nu)e^{-i\nu\lambda} \right)\left( \sum_{\nu=-N}^{N} x_2(\nu)e^{i\nu\lambda} \right) + \frac{1}{2\pi(2N+1)} \left( \sum_{\nu=-N}^{N} x_1(\nu)e^{i\nu\lambda} \right)\left( \sum_{\nu=-N}^{N} x_2(\nu)e^{-i\nu\lambda} \right). \tag{1.2}$$

The two stationary parts, $x_1(n)$ and $x_2(n)$, have the spectral representations

$$x_1(n) = \int_{-\pi}^{\pi} e^{in\lambda}\,dz_1(\lambda), \qquad x_2(n) = \int_{-\pi}^{\pi} e^{in\lambda}\,dz_2(\lambda), \tag{1.3}$$

where $z_1(\lambda)$ and $z_2(\lambda)$ are orthogonal processes.

We shall use the following two lemmas. 

LEMMA 1 (Karhunen [2]). If z(s) is an orthogonal process with the associated measure σ(s) on the subsets s of the elements (λ) of W, and if $g_1(\lambda)$ and $g_2(\lambda)$ are complex valued functions of the real variable λ such that each of them is quadratically integrable on W with respect to the σ-measure, then we have

$$E\left[ \int_W g_1(\lambda)\,dz(\lambda)\; \overline{\int_W g_2(\lambda)\,dz(\lambda)} \right] = \int_W g_1(\lambda)\,\overline{g_2(\lambda)}\,d\sigma(\lambda), \tag{1.4}$$

where $d\sigma(\lambda) = E\{dz(\lambda)\,\overline{dz(\lambda)}\}$.
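LEMMA 2 (Grenander [1]). For any discrete, real, stationary, normal process with absolutely continuous spectrum, it was shown that

$$E[I_N(\lambda)] = \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin^2[(2N+1)(l-\lambda)/2]}{\sin^2[(l-\lambda)/2]}\, f(l)\,dl, \tag{1.5}$$

$$D^2[I_N(\lambda)] = \left[ \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin^2[(2N+1)(l-\lambda)/2]}{\sin^2[(l-\lambda)/2]}\, f(l)\,dl \right]^2 + \left[ \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin[(2N+1)(l-\lambda)/2]}{\sin[(l-\lambda)/2]} \cdot \frac{\sin[(2N+1)(l+\lambda)/2]}{\sin[(l+\lambda)/2]}\, f(l)\,dl \right]^2, \tag{1.6}$$

where $D^2[I_N(\lambda)]$ denotes the variance of $I_N(\lambda)$; also that

$$\operatorname{cov}[I_N(\lambda), I_N(\mu)] = R_N(\lambda,\mu) = \left[ \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin[(2N+1)(l-\lambda)/2]}{\sin[(l-\lambda)/2]} \cdot \frac{\sin[(2N+1)(l-\mu)/2]}{\sin[(l-\mu)/2]}\, f(l)\,dl \right]^2 + \left[ \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin[(2N+1)(l-\lambda)/2]}{\sin[(l-\lambda)/2]} \cdot \frac{\sin[(2N+1)(l+\mu)/2]}{\sin[(l+\mu)/2]}\, f(l)\,dl \right]^2, \tag{1.7}$$

where f(l) is the spectral density.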





2. Expectation and variance of $I_N(\lambda)$. Using Lemmas 1 and 2, it is easily seen that for our processes, i.e., for discrete, real, stationary, normal processes whose spectrum includes, besides the absolutely continuous part, a step part with a finite number of saltuses,

$$E[I_N(\lambda)] = \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin^2[(2N+1)(l-\lambda)/2]}{\sin^2[(l-\lambda)/2]}\, d\sigma_1(l) + \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin^2[(2N+1)(l-\lambda)/2]}{\sin^2[(l-\lambda)/2]}\, d\sigma_2(l), \tag{2.1}$$

$$D^2[I_N(\lambda)] = \left[ \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin^2[(2N+1)(l-\lambda)/2]}{\sin^2[(l-\lambda)/2]}\, d(\sigma_1(l)+\sigma_2(l)) \right]^2 + \left[ \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin[(2N+1)(l-\lambda)/2]}{\sin[(l-\lambda)/2]} \cdot \frac{\sin[(2N+1)(l+\lambda)/2]}{\sin[(l+\lambda)/2]}\, d(\sigma_1(l)+\sigma_2(l)) \right]^2, \tag{2.2}$$

$$\operatorname{cov}[I_N(\lambda), I_N(\mu)] = R_N(\lambda,\mu) = R_N^{(1)}(\lambda,\mu) + R_N^{(2)}(\lambda,\mu), \tag{2.3}$$

where

$$R_N^{(1)}(\lambda,\mu) = \left[ \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin[(2N+1)(l-\lambda)/2]}{\sin[(l-\lambda)/2]} \cdot \frac{\sin[(2N+1)(l-\mu)/2]}{\sin[(l-\mu)/2]}\, d(\sigma_1(l)+\sigma_2(l)) \right]^2, \tag{2.4}$$

$$R_N^{(2)}(\lambda,\mu) = \left[ \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin[(2N+1)(l-\lambda)/2]}{\sin[(l-\lambda)/2]} \cdot \frac{\sin[(2N+1)(l+\mu)/2]}{\sin[(l+\mu)/2]}\, d(\sigma_1(l)+\sigma_2(l)) \right]^2. \tag{2.5}$$

From the nature of the two parts $x_1(n)$ and $x_2(n)$ of the process x(n), their spectra $\sigma_1(\lambda)$ and $\sigma_2(\lambda)$ are respectively a pure step function and an absolutely continuous bounded measure function. Also it is evident that the spectrum σ(λ) of the process x(n) is the sum of $\sigma_1(\lambda)$ and $\sigma_2(\lambda)$, the spectra of the two parts.


3. Asymptotic unbiassedness and inconsistency of $I_N(\lambda)$. We shall now prove

THEOREM 1. For any real, discrete, stationary, normal process whose spectrum consists of an absolutely continuous part and a step function with a finite number of saltuses, $I_N(\lambda)$ is an asymptotically unbiased estimate of f(λ) at every point of continuity of σ(λ).

PROOF. Let $S_1, S_2, \ldots, S_p$ be the steps of $\sigma_1(\lambda)$ corresponding to the values $\lambda_1, \lambda_2, \ldots, \lambda_p$ of λ in (−π, π). We have from (2.1)

$$E[I_N(\lambda)] = \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin^2[(2N+1)(l-\lambda)/2]}{\sin^2[(l-\lambda)/2]}\, d\sigma_1(l) + \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin^2[(2N+1)(l-\lambda)/2]}{\sin^2[(l-\lambda)/2]}\, f(l)\,dl, \tag{3.1}$$

where

$$d\sigma_2(l) = f(l)\,dl. \tag{3.2}$$

The first term on the right-hand side (R.H.S.) of (3.1) can be written as

$$\frac{1}{2\pi(2N+1)} \sum_{k=1}^{p} S_k\, \frac{\sin^2[(2N+1)(\lambda_k-\lambda)/2]}{\sin^2[(\lambda_k-\lambda)/2]}. \tag{3.3}$$

If λ is a point of continuity of the spectrum σ(λ), it does not coincide with any one of $\lambda_k$, k = 1, 2, ..., p, and hence all the p terms in the above expression are finite. As N → ∞ the above expression tends to zero. By Fejér's theorem the second term on the R.H.S. of (3.1) tends to f(λ) as N → ∞. We have thus established that

$$\lim_{N\to\infty} E[I_N(\lambda)] = f(\lambda), \tag{3.4}$$

at a point of continuity.
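
The two facts used in this proof can be illustrated numerically: the normalised Fejér kernel has total mass 1, and the contribution (3.3) of a saltus away from λ decays like 1/(2N + 1). The sketch below is only an illustration; the function names, the saltus list and the quadrature are assumptions, not part of the paper.

```python
import math


def fejer_weight(t, N):
    """sin^2((2N+1)t/2) / (2*pi*(2N+1)*sin^2(t/2)); the value at t = 0 is the limit."""
    n = 2 * N + 1
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:
        return n / (2.0 * math.pi)
    return math.sin(n * t / 2.0) ** 2 / (2.0 * math.pi * n * s * s)


def kernel_mass(N):
    """Midpoint-rule integral of the kernel over (-pi, pi); exact up to rounding
    because the kernel is a trigonometric polynomial of degree 2N."""
    m = 8 * N + 8
    h = 2.0 * math.pi / m
    return h * sum(fejer_weight(-math.pi + (i + 0.5) * h, N) for i in range(m))


def step_term(N, lam, saltuses):
    """Expression (3.3) for a list of (lambda_k, S_k) pairs."""
    return sum(S * fejer_weight(lk - lam, N) for lk, S in saltuses)


if __name__ == "__main__":
    for N in (10, 100, 1000):
        print(N, kernel_mass(N), step_term(N, 0.5, [(1.0, 2.0)]))
```

The second printed column stays at 1 while the third shrinks with N, which is the mechanism behind (3.4).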

THEOREM 2. For any discrete, real, stationary, normal process whose spectrum consists of an absolutely continuous part and a step function with a finite number of saltuses, the variance $D^2[I_N(\lambda)]$ tends to [f(λ)]² or 2[f(λ)]² according as λ ≠ 0 or λ = 0 at a point of continuity of the spectrum.

PROOF. From (2.2)

$$D^2[I_N(\lambda)] = \left[ \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin^2[(2N+1)(l-\lambda)/2]}{\sin^2[(l-\lambda)/2]}\, d(\sigma_1(l)+\sigma_2(l)) \right]^2 + \left[ \frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin[(2N+1)(l-\lambda)/2]}{\sin[(l-\lambda)/2]} \cdot \frac{\sin[(2N+1)(l+\lambda)/2]}{\sin[(l+\lambda)/2]}\, d(\sigma_1(l)+\sigma_2(l)) \right]^2. \tag{3.5}$$

By an argument like that of the previous theorem, the first term on the R.H.S. of (3.5) tends to [f(λ)]² at a point of continuity of σ(λ). In the second term the contribution of the term containing $\sigma_1(l)$ tends to zero as N → ∞, so that we have only to investigate the nature of

$$\frac{1}{2\pi(2N+1)} \int_{-\pi}^{\pi} \frac{\sin[(2N+1)(l-\lambda)/2]}{\sin[(l-\lambda)/2]} \cdot \frac{\sin[(2N+1)(l+\lambda)/2]}{\sin[(l+\lambda)/2]}\, f(l)\,dl. \tag{3.6}$$

Case I: λ = 0. In view of Fejér's theorem it is easily seen that (3.6) tends to $[f(\lambda)]_{\lambda=0}$ as N → ∞.

Case II: λ ≠ 0. We divide the range of integration (−π, π) into six parts as follows, letting λ > 0: (−π, −λ − ε), (−λ − ε, −λ + ε), (−λ + ε, 0), (0, λ − ε′), (λ − ε′, λ + ε′), and (λ + ε′, π), where ε, ε′ are small, arbitrary, positive constants. Denote the corresponding integrals by $I_1, I_2, I_3, I_4, I_5$ and $I_6$. Applying the first mean value theorem, it is easily seen that $I_1$, $I_3$, $I_4$ and $I_6$
tend to zero as N → ∞. Consider

$$I_5 = \frac{1}{2\pi(2N+1)} \int_{\lambda-\varepsilon'}^{\lambda+\varepsilon'} \frac{\sin[(2N+1)(l-\lambda)/2]}{\sin[(l-\lambda)/2]} \cdot \frac{\sin[(2N+1)(l+\lambda)/2]}{\sin[(l+\lambda)/2]}\, f(l)\,dl. \tag{3.7}$$

Putting l − λ = t, we have

$$I_5 = \frac{1}{2\pi(2N+1)} \int_{-\varepsilon'}^{\varepsilon'} \frac{\sin[(2N+1)t/2]}{\sin[t/2]} \cdot \frac{\sin[(2N+1)(t+2\lambda)/2]}{\sin[(t+2\lambda)/2]}\, f(\lambda+t)\,dt$$

$$= \frac{1}{2\pi(2N+1)} \int_{0}^{\varepsilon'} \frac{\sin[(2N+1)t/2]}{\sin[t/2]} \cdot \frac{\sin[(2N+1)(2\lambda-t)/2]}{\sin[(2\lambda-t)/2]}\, f(\lambda-t)\,dt + \frac{1}{2\pi(2N+1)} \int_{0}^{\varepsilon'} \frac{\sin[(2N+1)t/2]}{\sin[t/2]} \cdot \frac{\sin[(2N+1)(t+2\lambda)/2]}{\sin[(t+2\lambda)/2]}\, f(\lambda+t)\,dt. \tag{3.8}$$



Hence, for a suitable finite constant k,

$$|I_5| \le \frac{k}{2\pi(2N+1)} \int_{0}^{\varepsilon'} \left| \frac{\sin[(2N+1)t/2]}{\sin[t/2]} \right| dt \le \frac{k}{2\pi(2N+1)} \int_{0}^{\pi} \left| \frac{\sin[(2N+1)t/2]}{\sin[t/2]} \right| dt,$$

which can be written (Zygmund [3] p. 67) as

$$|I_5| < \frac{k}{2\pi(2N+1)}\, O(\log N). \tag{3.9}$$

Hence $\lim_{N\to\infty} I_5 = 0$. Similarly $\lim_{N\to\infty} I_2 = 0$.

Therefore the expression (3.6), when λ ≠ 0, tends to zero as N → ∞. We have thus established that, at a point of continuity λ ≠ 0,

$$\lim_{N\to\infty} D^2[I_N(\lambda)] = [f(\lambda)]^2,$$

while at λ = 0, $\lim_{N\to\infty} D^2[I_N(\lambda)] = 2[f(\lambda)]^2_{\lambda=0}$. Thus, except in the trivial case f(λ) = 0, $I_N(\lambda)$ is not a consistent estimate of the spectral density at a point of continuity of σ(λ).


4. Consistency of the weighted periodogram estimator. We will now try to construct a weighted consistent estimator for the spectral density at a point of continuity.

Consider

$$I_N(\lambda) = \frac{1}{2\pi(2N+1)} \sum_{\nu=-N}^{N} x(\nu)e^{i\nu\lambda} \sum_{\nu=-N}^{N} x(\nu)e^{-i\nu\lambda}. \tag{4.1}$$

Since x(ν) is real, it is easy to verify that $I_N(\lambda) = I_N(-\lambda)$, i.e., $I_N(\lambda)$ is an even function of λ in (−π, π).

Let w(λ) be an even function of λ such that, within (0, π), w(λ) vanishes outside $(\lambda_0 - h, \lambda_0 + h)$ and h is so chosen that the h neighborhood of $\lambda_0$ does not contain any saltus of σ(λ).

Consider

$$f_N^*(\lambda_0) = \int_{-\pi}^{\pi} I_N(l)\,w(l)\,dl = 2\int_{\lambda_0-h}^{\lambda_0+h} I_N(l)\,w(l)\,dl. \tag{4.2}$$

Taking expectations on both sides of (4.2), we have

$$E[f_N^*(\lambda_0)] = 2\int_{\lambda_0-h}^{\lambda_0+h} E[I_N(l)]\,w(l)\,dl. \tag{4.3}$$

Taking limits as N → ∞ we have, at a point of continuity,

$$\lim_{N\to\infty} E[f_N^*(\lambda_0)] = 2\int_{\lambda_0-h}^{\lambda_0+h} f(l)\,w(l)\,dl. \tag{4.4}$$

Adding the condition for $f_N^*(\lambda_0)$ to estimate asymptotically unbiasedly f(λ) at a point of continuity $\lambda_0$ of σ(λ), we have

$$2\int_{\lambda_0-h}^{\lambda_0+h} f(l)\,w(l)\,dl = f(\lambda_0). \tag{4.5}$$

If f(λ) does not vary too much in the neighborhood of $\lambda_0$, the approximate condition for asymptotic unbiassedness is

$$\int_{\lambda_0-h}^{\lambda_0+h} w(l)\,dl = \tfrac{1}{2}. \tag{4.6}$$
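
As a rough illustration of the construction, the sketch below computes the periodogram and a weighted estimate of the form $2\int I_N(l)w(l)\,dl$ over $(\lambda_0 - h, \lambda_0 + h)$. The triangular weight is an assumption chosen here only because it is continuous and integrates to 1/2 over that interval; the midpoint quadrature, the panel count and all function names are likewise illustrative.

```python
import cmath
import math


def periodogram(x, lam):
    """I_N(lam) for samples x(-N), ..., x(N) stored in a list of length 2N+1."""
    n = len(x)
    N = (n - 1) // 2
    s = sum(x[v + N] * cmath.exp(-1j * v * lam) for v in range(-N, N + 1))
    return abs(s) ** 2 / (2.0 * math.pi * n)


def triangular_weight(l, lam0, h):
    """Continuous weight on (lam0-h, lam0+h) whose integral there is 1/2."""
    d = abs(l - lam0)
    return (1.0 - d / h) / (2.0 * h) if d < h else 0.0


def weighted_estimate(x, lam0, h, panels=200):
    """Midpoint-rule approximation to 2 * integral of I_N(l) w(l) dl."""
    step = 2.0 * h / panels
    total = 0.0
    for i in range(panels):
        l = lam0 - h + (i + 0.5) * step
        total += periodogram(x, l) * triangular_weight(l, lam0, h)
    return 2.0 * total * step


if __name__ == "__main__":
    N = 20
    impulse = [0.0] * N + [1.0] + [0.0] * N   # flat periodogram 1/(2*pi*(2N+1))
    print(weighted_estimate(impulse, 1.0, 0.3), 1.0 / (2.0 * math.pi * (2 * N + 1)))
```

For a unit impulse the periodogram is flat, so the weighted estimate reproduces it exactly; this is the situation in which the normalisation of the weight alone determines the bias.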

THEOREM 3. Let w(λ) be a continuous weight function satisfying the conditions imposed in Section 4 and (4.6). Let the spectral density f(λ) be continuous. Then, at a point of continuity $\lambda_0$ of σ(λ), the variance of the weighted estimator $f_N^*(\lambda_0)$ goes to zero as N → ∞.

Proor. We have from Grenander [1] that 


4x (2N + 1)°Df¥(Xo) = a, r(n + m)r(k + 1)W(n + k)W(m + 1) 


—N 


(4.7) 


N 


+ Dd r(n+m)r(k+)W(m+)Want+h, 


n,m,k,l 
—N 
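where

$$r(n) = \int_{-\pi}^{\pi} e^{in\lambda}\,d\sigma(\lambda) = \int_{-\pi}^{\pi} \cos n\lambda\,d\sigma(\lambda), \tag{4.8}$$

$$W(n) = \int_{-\pi}^{\pi} e^{in\lambda}\,w(\lambda)\,d\lambda = \int_{-\pi}^{\pi} \cos n\lambda\;w(\lambda)\,d\lambda. \tag{4.9}$$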


Since w(d) is an even function we have 


4x°(2N + 1)°D*(fx(d.)) 
N 


cso) =2 >> r(in+m)r(k +)W(n + k)W(m + 1). 


n,m,k,l 
—N 


Again following Grenander [1] we have 


Qn°(2N + 1)DUfs(d0)] < >> r(a)r(B)W(y)Wla + B — x) 
ap. 


(4.11) x | x t(n)W(n + »| > r(n)W(n — |. 


y=—2N n=—2N n=—?2 
Case 1. do(X) = f(A)dd 


where f(A) is an even function, being the spectral density of a real process. We 
have 


N oo 
f(A) = lim Dy r(n)e*™ = >> r(n) cos nd, 


N~o —N 


N 0 
4w(A) = lim > W(n)e"™ = >> W(n) cos nd, 


N~o —N 
N 


d)w(rA) = lim > d(n)e™, 


jv \ 
\ N>o —N 


where 
2 


(4.13) d(v) = Dor(n)W(n + v) = DS W(n)r(n + »). 


—s 


Let us write d’*(v) = 7 as r(n)W(n + v). We have from (4.11) that 


2N 
(4.14) Qn°(2N + 1)Dfe(d.) < Dd {d*(r)}’. 
—2N 


Taking the limit as N → ∞ we have, since

$$\sum_{\nu=-\infty}^{\infty} d^2(\nu) < \infty,$$

$$\lim_{N\to\infty} D^2[f_N^*(\lambda_0)] = 0.$$

Case 2. $d\sigma(\lambda) = f(\lambda)\,d\lambda + d\sigma_1(\lambda)$, where $\sigma_1(\lambda)$ is a step function with a finite number of saltuses $S_1, S_2, \ldots, S_p$ at $\lambda_1, \lambda_2, \ldots, \lambda_p$ respectively.
We have from (4.8) 


Pp 
(4.15) r(n) = ry(n) + > COS NA; . 


t=] 


We have from (4.11) in another form 


Qn (2N + 1)D*(fx(&)) < = | > W(n)r(n + »| 


‘| > W(a)r(n — »| 
n=—2N 


v=—2N n=—2N 


v=—2N n=—2N n=—2N 


2N 2N p 2N 
= 2 ‘| > Wn)rni(n+v)+>S8; >> Wn) cos nem | 
t=] 


. | > w(n)r(n — v) +> S; >> Wn) cosn — mt 
n t=] 4 


n=—2N 


=—2N 
2N p 


(4.16) = > 2(a(»))* + dr) +S, 


v=—2N dae 


2N 
- >} W(n){cos nd; cos vA; + sin nd, sin vA,] 


n=—2N 


2N 


‘ Pp 
+ d*(v) > S; >> W(n)[cos nd; cos vA; — sin nd; sin rrJ 


t=] n=—2N 


2N 


P 
+ >) 8:8; >> W(n)[cos nd; cos vA; + sin nd; sin vA,] 


1,j=1 n=—2N 
2N 


> W(n)[cos nd; cos vA; — sin nd; sin mit ; 


n=—2N 


But we have, in view of the conditions imposed on the weight function, that 


= W(n) cos ni; = W(A;) = 0, 


2 


{>° W(n) sin nd; = 0, 


—w 


2 


> dr) < @. 
Taking limits on both sides of (4.16) as N → ∞, and taking into account (4.17), we have, at a point of continuity of σ(λ), that

$$\lim_{N\to\infty} D^2[f_N^*(\lambda_0)] = 0,$$

which proves the theorem.
ON A COINCIDENCE PROBLEM CONCERNING PARTICLE COUNTERS

By Lajos Takács¹

Columbia University


1. Introduction. A general model of particle counting will be considered. Suppose that particles arrive at a counting device at the instants $\tau_1, \tau_2, \ldots, \tau_n, \ldots$, where the inter-arrival times $\tau_n - \tau_{n-1}$ (n = 1, 2, ...; $\tau_0 = 0$) are identically distributed, independent, positive random variables with distribution function $P\{\tau_n - \tau_{n-1} \le x\} = F(x)$, n = 1, 2, .... Suppose that each particle, independently of the others, on its arrival gives rise to an impulse either with probability p (0 < p ≤ 1) if at this instant there is at least one impulse present or with probability 1 if there is no impulse present. Let q = 1 − p. Denote by $\chi_n$ the duration of the impulse (if any) starting at $\tau_n$. It is supposed that $\{\chi_n\}$ is a sequence of identically distributed, independent, positive random variables with distribution function

$$H(x) = \begin{cases} 1 - e^{-\mu x} & \text{if } x \ge 0, \\ 0 & \text{if } x < 0, \end{cases} \tag{1}$$

and independent of $\{\tau_n\}$ and the events of realizations of the impulses.

Denote by η(t) the number of impulses present at the instant t. Always η(0) = 0. We shall say that the system is in state $E_k$, k = 0, 1, 2, ..., at the instant t if η(t) = k. Write $P\{\eta(t) = k\} = P_k(t)$. Furthermore, denote by $\nu_t^{(k)}$ the number of transitions $E_k \to E_{k+1}$ ((k + 1)-fold coincidences, k = 0, 1, 2, ...) occurring in the time interval (0, t]. Write $E\{\nu_t^{(k)}\} = M_k(t)$.

The stochastic behavior of the process {η(t); 0 ≤ t < ∞} is characterized by two parameters, p and μ, and the distribution function F(x). Throughout this paper μ will always be fixed and only p and F(x) will vary. For the sake of brevity we shall say that the process {η(t); 0 ≤ t < ∞} is of type [F(x), p].

In what follows we shall give a method to determine the distributions of the random variables η(t) and $\nu_t^{(k)}$ for finite t and the corresponding asymptotic distributions as t → ∞. The above mentioned problems for processes of type [F(x), 1] were solved earlier by the author [13], [14]. The present model of particle counting in the particular case of Poisson input was introduced by G. E. Albert and L. Nelson [1] and generalizations have been given by the author [10], [12], R. Pyke [7], and W. L. Smith [9].
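
The model lends itself to a short event-driven simulation. The sketch below is an illustration only: the function name `simulate_counter`, the callback `draw_interarrival` and the returned quantities are hypothetical, and the code simply follows the verbal description above (impulse with probability 1 when no impulse is present, with probability p otherwise, exponential impulse durations with parameter μ).

```python
import heapq
import math
import random


def simulate_counter(n_arrivals, p, mu, draw_interarrival, rng):
    """Simulate a process of type [F, p] over n_arrivals particles.

    Returns the time average of eta(t) up to the last arrival and a dict
    mapping k to the number of observed transitions E_k -> E_{k+1}.
    """
    t, last, area = 0.0, 0.0, 0.0
    ends = []                      # min-heap of end times of impulses in progress
    transitions = {}
    for _ in range(n_arrivals):
        t += draw_interarrival(rng)
        while ends and ends[0] <= t:          # impulses that ended before this arrival
            e = heapq.heappop(ends)
            area += (len(ends) + 1) * (e - last)
            last = e
        area += len(ends) * (t - last)
        last = t
        k = len(ends)                         # state E_k found by the particle
        if k == 0 or rng.random() < p:
            heapq.heappush(ends, t + rng.expovariate(mu))
            transitions[k] = transitions.get(k, 0) + 1
    return area / t, transitions


if __name__ == "__main__":
    rng = random.Random(1)
    mean, trans = simulate_counter(20000, 0.5, 1.0, lambda r: r.expovariate(1.0), rng)
    print(mean, sorted(trans.items())[:4])
```

With p = 1 and Poisson input the simulated process is the classical infinite-server system, which gives a simple sanity check: the time average of η(t) should be close to the arrival rate divided by μ, and the fraction of particles finding the counter empty close to exp(−rate/μ).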


2. The structure of the process {η(t)}. The stochastic behavior of the process of type [F(x), 1] is already known [14]. Now we shall show that the investigation of the process of type [F(x), p] can be reduced to that of the process of type $[\hat F(x), 1]$. For this purpose let us associate a new process with the process of type [F(x), p] by supposing that each particle independently of the others gives rise to an impulse with probability p, but otherwise every assumption remains unchanged. This new process can clearly be considered as a process of type $[\hat F(x), 1]$, where

$$\hat F(x) = p \sum_{n=1}^{\infty} q^{n-1} F_n(x) \tag{2}$$

and $F_n(x)$ denotes the nth iterated convolution of the distribution function F(x) with itself. It is easy to see that the only difference between the processes of type [F(x), p] and $[\hat F(x), 1]$ is that the latter contains an additional interval spent in state $E_0$ immediately before every transition $E_0 \to E_1$, where the lengths of these intervals are identically distributed, independent random variables with distribution function

$$Q(x) = p \sum_{n=0}^{\infty} q^{n} F_n(x) \tag{3}$$

and these random variables are independent of any other random variables in question. Here $F_0(x) = 1$ if x ≥ 0 and $F_0(x) = 0$ if x < 0. Thus, knowing the stochastic behavior of the process of type $[\hat F(x), 1]$ we can determine that of the process of type [F(x), p].
It is to be remarked that the process of type [F(x), p] is Markovian only in
z 


particular cases (e.g., F(a) = 1 — € “ for z = 0; F(z) = j~o (1 —p)p’ 


where 0 < p < 1; F(x) = 1 if x ≥ a and F(x) = 0 if x < a), but the instants $\tau_n$, n = 1, 2, ..., always form the regeneration points of the process. Accordingly for fixed k, k = 0, 1, 2, ..., the instants of the successive transitions $E_k \to E_{k+1}$ form a recurrent (or renewal) process, i.e., the time differences between successive transitions $E_k \to E_{k+1}$ are identically distributed, independent, positive random variables. Let us denote by $R_k(x)$ their common distribution function. Furthermore it is clear that the time differences between successive transitions $E_{k-1} \to E_k$ and $E_k \to E_{k+1}$, k = 0, 1, 2, ..., are also independent random variables. Denote by $G_k(x)$, k = 0, 1, 2, ..., their distribution function. (We say that a transition $E_{-1} \to E_0$ takes place at time t = 0.)


3. Notation. We mention in advance that for the process of type $[\hat F(x), 1]$ we shall use the same symbols as for the process of type [F(x), p] but with the circumflex added.

Throughout this paper we shall use the following symbols:


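$$\alpha = \int_0^{\infty} x\,dF(x), \qquad \varphi(s) = \int_0^{\infty} e^{-sx}\,dF(x), \qquad \hat\varphi(s) = \int_0^{\infty} e^{-sx}\,d\hat F(x),$$

$$\gamma_k(s) = \int_0^{\infty} e^{-sx}\,dG_k(x), \qquad \psi_k(s) = \int_0^{\infty} e^{-sx}\,dR_k(x),$$

$$\mu_k(s) = \int_0^{\infty} e^{-st}\,dM_k(t), \quad \Re(s) > 0, \qquad \pi_k(s) = \int_0^{\infty} e^{-st}\,P_k(t)\,dt, \quad \Re(s) > 0.$$

Furthermore

$$C_r = \prod_{i=1}^{r} \frac{\varphi_i}{1 - \varphi_i},$$

where $\varphi_j = \varphi(j\mu)$, j = 0, 1, 2, ..., and $C_0 = 1$.

Finally, we introduce a new random variable $\eta_n = \eta(\tau_n - 0)$ which is equal to the number of the impulses present at the arrival of the nth particle.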


4. The determination of the distribution of η(t). First we shall prove the following

LEMMA 1 (R. Pyke). If $M_0(t)$ is the expectation of the number of transitions $E_0 \to E_1$ occurring in the time interval (0, t] for the process of type [F(x), p], then

$$\mu_0(s) = \int_0^{\infty} e^{-st}\,dM_0(t) = \frac{\varphi(s)}{1 - \psi_0(s)}, \tag{4}$$

where

$$\psi_0(s) = 1 - \frac{1 - \varphi(s)}{p} \left\{ \left[ \sum_{r=0}^{\infty} (-p)^r \prod_{i=1}^{r} \frac{\varphi(s+i\mu)}{1 - \varphi(s+i\mu)} \right]^{-1} - q \right\}. \tag{5}$$

PROOF. This lemma in two particular cases, when either p = 1 or $F(x) = 1 - e^{-\lambda x}$ if x ≥ 0, has been proved earlier by the author [10], [11], [12]. A proof for the general case has been given by R. Pyke [7]. Now we shall give another proof.

By using renewal theory we obtain

$$M_0(t) = G_0(t) + G_0(t) * R_0(t) + G_0(t) * R_0(t) * R_0(t) + \cdots \tag{6}$$

and here $G_0(x) = F(x)$. Forming the Laplace-Stieltjes transform of (6), we get (4). It remains only to determine $\psi_0(s)$. For this purpose consider the associated process $[\hat F(x), 1]$. Then we have

$$\int_0^{\infty} e^{-st}\,d\hat M_0(t) = \frac{\hat\varphi(s)}{1 - \hat\psi_0(s)}, \tag{7}$$

where, by (2),

$$\hat\varphi(s) = \frac{p\,\varphi(s)}{1 - q\,\varphi(s)}, \tag{8}$$

and, in this case,

$$\hat\mu_0(s) = \int_0^{\infty} e^{-st}\,d\hat M_0(t) = \sum_{r=0}^{\infty} (-1)^r \prod_{i=0}^{r} \frac{p\,\varphi(s+i\mu)}{1 - \varphi(s+i\mu)}. \tag{9}$$

Formula (9) follows by [14] where we showed that, if M(t) denotes the expectation of the number of transitions $E_0 \to E_1$ occurring in the time interval (0, t] for a process of type [F(x), 1], then

$$\int_0^{\infty} e^{-st}\,dM(t) = \sum_{r=0}^{\infty} (-1)^r \prod_{i=0}^{r} \frac{\varphi(s+i\mu)}{1 - \varphi(s+i\mu)}. \tag{10}$$

If we replace φ(s) by $\hat\varphi(s)$ in (10), we obtain (9). Comparing (7), (8) and (9) we obtain

$$\hat\psi_0(s) = 1 - \frac{p\,\varphi(s)}{1 - q\,\varphi(s)} \left[ \sum_{r=0}^{\infty} (-1)^r \prod_{i=0}^{r} \frac{p\,\varphi(s+i\mu)}{1 - \varphi(s+i\mu)} \right]^{-1}. \tag{11}$$

On the other hand, taking into consideration what we mentioned in Section 2, we have

$$\hat R_0(x) = Q(x) * R_0(x)$$

where Q(x) is defined by (3). Thus

$$\hat\psi_0(s) = \frac{p}{1 - q\,\varphi(s)}\, \psi_0(s). \tag{12}$$

By (11) and (12) we get $\psi_0(s)$. This completes the proof of the lemma.
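REMARK 1. The proof of (10) is simple. For a process of type [F(x), 1] we have

$$E\{\nu_t^{(0)} \mid \tau_1 = y,\ \chi_1 = x\} = \begin{cases} 1 + [M(t-y) - M(x)] & \text{if } y + x \le t, \\ 1 & \text{if } y \le t < y + x, \\ 0 & \text{if } y > t, \end{cases}$$

and by the theorem of total expectation we get

$$M(t) = F(t) + \int_0^t M(t-y)\,(1 - e^{-\mu(t-y)})\,dF(y) - \int_0^t M(x)\,F(t-x)\,\mu e^{-\mu x}\,dx. \tag{13}$$

Forming the Laplace-Stieltjes transform of (13) with

$$\mu(s) = \int_0^{\infty} e^{-st}\,dM(t),$$

we get the functional equation

$$\mu(s) = \frac{\varphi(s)}{1 - \varphi(s)}\,[1 - \mu(s + \mu)],$$

whose solution is

$$\mu(s) = \sum_{r=0}^{\infty} (-1)^r \prod_{i=0}^{r} \frac{\varphi(s+i\mu)}{1 - \varphi(s+i\mu)}. \tag{14}$$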
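
The functional equation of Remark 1 and its series solution can be checked numerically. The sketch below assumes the readings $\mu(s) = \frac{\varphi(s)}{1-\varphi(s)}[1 - \mu(s+\mu)]$ and $\mu(s) = \sum_r (-1)^r \prod_{i=0}^{r} \varphi(s+i\mu)/(1-\varphi(s+i\mu))$, and uses Poisson input, $\varphi(s) = \lambda/(\lambda+s)$, purely as an example; the rate symbol `lam`, the truncation `terms` and the function names are illustrative assumptions.

```python
import math


def series_solution(s, mu, phi, terms=200):
    """Truncated series sum_{r>=0} (-1)^r prod_{i=0}^{r} phi(s+i*mu)/(1-phi(s+i*mu))."""
    total, prod = 0.0, 1.0
    for r in range(terms):
        f = phi(s + r * mu)
        prod *= f / (1.0 - f)
        total += (-1) ** r * prod
    return total


def functional_equation_gap(s, mu, phi):
    """Difference between the two sides of mu(s) = phi/(1-phi) * [1 - mu(s+mu)]."""
    f = phi(s)
    lhs = series_solution(s, mu, phi)
    rhs = f / (1.0 - f) * (1.0 - series_solution(s + mu, mu, phi))
    return lhs - rhs


if __name__ == "__main__":
    lam, mu = 2.0, 1.5
    phi = lambda s: lam / (lam + s)      # Poisson input, an illustrative choice
    print(functional_equation_gap(0.7, mu, phi))
    s = 1e-8
    print(s * series_solution(s, mu, phi), lam * math.exp(-lam / mu))
```

For Poisson input the long-run rate of transitions $E_0 \to E_1$ should equal $\lambda e^{-\lambda/\mu}$, which is what the second printed pair compares.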


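Now we shall prove

THEOREM 1. The distribution $\{P_k(t)\}$ is determined uniquely by the following Laplace transforms

$$\int_0^{\infty} e^{-st} P_k(t)\,dt = \frac{\displaystyle\sum_{r=k}^{\infty} (-1)^{r-k} \binom{r}{k} \frac{p^r}{s + r\mu} \prod_{i=0}^{r-1} \frac{\varphi(s+i\mu)}{1 - \varphi(s+i\mu)}}{\displaystyle 1 - q \sum_{r=0}^{\infty} (-p)^r \prod_{i=1}^{r} \frac{\varphi(s+i\mu)}{1 - \varphi(s+i\mu)}} \tag{15}$$

if k = 1, 2, ..., and

$$\int_0^{\infty} e^{-st} P_0(t)\,dt = \frac{\displaystyle\sum_{r=0}^{\infty} (-1)^{r} \frac{p^r}{s + r\mu} \prod_{i=0}^{r-1} \frac{\varphi(s+i\mu)}{1 - \varphi(s+i\mu)} - \frac{q}{s} \sum_{r=0}^{\infty} (-p)^r \prod_{i=1}^{r} \frac{\varphi(s+i\mu)}{1 - \varphi(s+i\mu)}}{\displaystyle 1 - q \sum_{r=0}^{\infty} (-p)^r \prod_{i=1}^{r} \frac{\varphi(s+i\mu)}{1 - \varphi(s+i\mu)}}, \tag{16}$$

where the empty product means 1.

PROOF. Consider the process of type [F(x), p] and denote by $C_k(t)$, k = 1, 2, ..., the probability that the system is in state $E_k$ after a time t measured from a point of transition $E_0 \to E_1$ and during this time interval of length t there are no other transitions $E_0 \to E_1$. Clearly this probability is the same for the process of type $[\hat F(x), 1]$. Thus by the theorem of total probability we obtain

$$\hat P_k(t) = \int_0^t C_k(t-u)\,d\hat M_0(u), \qquad k = 1, 2, \ldots, \tag{17}$$

and similarly

$$P_k(t) = \int_0^t C_k(t-u)\,dM_0(u), \qquad k = 1, 2, \ldots, \tag{18}$$

if we take into consideration that the event that the system is in state $E_k$ at the instant t can occur in several mutually exclusive ways: the last transition $E_0 \to E_1$ in the time interval (0, t] is the 1st, 2nd, ..., nth, ..., and this transition takes place at the instant u (0 ≤ u ≤ t).

Forming the Laplace-Stieltjes transforms of (17) and (18), we get

$$s\,\hat\pi_k(s) = \hat\mu_0(s) \int_0^{\infty} e^{-st}\,dC_k(t) \tag{19}$$

and

$$s\,\pi_k(s) = \mu_0(s) \int_0^{\infty} e^{-st}\,dC_k(t). \tag{20}$$

Comparing (19) and (20) we obtain

$$\pi_k(s) = \hat\pi_k(s)\,\frac{\mu_0(s)}{\hat\mu_0(s)}, \qquad k = 1, 2, \ldots, \tag{21}$$

and thus

$$\pi_0(s) = \hat\pi_0(s)\,\frac{\mu_0(s)}{\hat\mu_0(s)} + \frac{1}{s}\left(1 - \frac{\mu_0(s)}{\hat\mu_0(s)}\right) \tag{22}$$

also holds because

$$\sum_{k=0}^{\infty} \pi_k(s) = \sum_{k=0}^{\infty} \hat\pi_k(s) = \frac{1}{s}.$$

In [14] we have determined $\pi_k(s)$ for the process of type [F(x), 1]. If we replace φ(s) by $\hat\varphi(s)$ there, then we obtain $\hat\pi_k(s)$, namely

$$\hat\pi_k(s) = \sum_{r=k}^{\infty} (-1)^{r-k} \binom{r}{k} \frac{p^r}{s + r\mu} \prod_{i=0}^{r-1} \frac{\varphi(s+i\mu)}{1 - \varphi(s+i\mu)}, \qquad k = 0, 1, 2, \ldots. \tag{23}$$

On the other hand $\mu_0(s)$ is defined by (4) and (5) and $\hat\mu_0(s)$ by (9) and thus

$$\frac{\mu_0(s)}{\hat\mu_0(s)} = \left[ 1 - \frac{q\,(1 - \varphi(s))}{p\,\varphi(s)}\,\hat\mu_0(s) \right]^{-1} = \left\{ 1 - q \sum_{r=0}^{\infty} (-p)^r \prod_{i=1}^{r} \frac{\varphi(s+i\mu)}{1 - \varphi(s+i\mu)} \right\}^{-1}. \tag{24}$$

The formulas (21), (22), (23) and (24) prove the theorem.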
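REMARK 2. Using a well known Tauberian theorem we get that

$$P_k^* = \lim_{t\to\infty} \frac{1}{t} \int_0^t P_k(u)\,du \tag{25}$$

exists and

$$P_k^* = \lim_{s\to 0} s\,\pi_k(s). \tag{26}$$

If α < ∞ then $\{P_k^*\}$ is a probability distribution for which

$$P_0^* = 1 - \frac{\displaystyle\sum_{r=1}^{\infty} (-1)^{r-1} \frac{p^r}{r}\, C_{r-1}}{\displaystyle \alpha\mu \left[ 1 - q \sum_{r=0}^{\infty} (-p)^r C_r \right]} \tag{27}$$

and

$$P_k^* = \frac{\displaystyle\sum_{r=k}^{\infty} (-1)^{r-k} p^r \binom{r-1}{k-1} C_{r-1}}{\displaystyle k\alpha\mu \left[ 1 - q \sum_{r=0}^{\infty} (-p)^r C_r \right]}, \qquad k = 1, 2, \ldots. \tag{28}$$

We shall show later that if α < ∞ and F(x) is not a lattice distribution then $\lim_{t\to\infty} P_k(t)$ exists and then obviously $\lim_{t\to\infty} P_k(t) = P_k^*$, k = 0, 1, 2, ....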

5. The determination of the distribution of $\nu_t^{(k)}$. Knowing the distribution functions $G_0(x), G_1(x), \ldots, G_k(x)$ and $R_k(x)$, the distribution of $\nu_t^{(k)}$ can be determined easily. We have

$$P\{\nu_t^{(k)} > n\} = G_0(t) * G_1(t) * \cdots * G_k(t) * R_k(t) * \cdots * R_k(t) \tag{29}$$

where the right hand side contains the nth iterated convolution of $R_k(t)$. Define

$$\rho_k = \int_0^{\infty} x\,dR_k(x) \tag{30}$$

and

$$\sigma_k^2 = \int_0^{\infty} (x - \rho_k)^2\,dR_k(x). \tag{31}$$

If $\sigma_k < \infty$, then we have

$$\lim_{t\to\infty} P\left\{ \frac{\nu_t^{(k)} - t/\rho_k}{(\sigma_k^2\, t/\rho_k^3)^{1/2}} \le x \right\} = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{x} e^{-y^2/2}\,dy \tag{32}$$

as is well known in renewal theory. (Cf. W. Feller [4], W. L. Smith [8], and the author [11].)

Thus the problem is reduced to the determination of the distribution functions $G_0(x), G_1(x), \ldots, G_k(x)$ and $R_k(x)$. We shall prove
THEOREM 2. We have 


far i —sz —_ D,(s) 
(33) v-(s) I ee” dG,(x) " has’ 


and 


vi(s) = [ e dR,(x) 
- {1 - oS (-oy fy (Het) 


i ee r=0 i=] o(s + ip) 


Du(s) {3 (1) (4) TT (Pee + Nt 


1=0 i— o(8 + ip) 





where Do(s) = 1 and 
7 (1 ise +) 
= 2 () 0 (Germ 


_ dl — sls A y! TT (? — o(s + ))} 
po(s) x r(- 1) Th, po(s + in) /f 
We shall prove Theorem 2 in two parts. First we shall determine $D_r(s)$, r = 0, 1, 2, ..., and then $\psi_r(s)$, r = 0, 1, 2, .... But we should like to remark here that the mean $\rho_k$ and the variance $\sigma_k^2$ of $R_k(x)$ can be calculated by (34) if we take into consideration that

$$\psi_k(s) = 1 - \rho_k s + \frac{\sigma_k^2 + \rho_k^2}{2}\, s^2 + o(s^2) \tag{36}$$

as s → 0. Since


k+1 . Y. k+1 i j-1 
Duss(s) = p +003 (* 41) Ct — og 2 (REN S (4 is 


j=l J P j=0 J b=1 


we pect) PS (1) (;) 
= | 1) () IL (; (s + cp) 5 | 1) k 
y(-p (;,) vc. + +25 (-"*(F) re, d ot) 


— i= O(tu)(1 — o(tu)] 
and 


1-9 (-p)' 2 (+ o(s + in) yaa ~eb (prc. 


es o(s (s + ip) ro 


Ne x aes y Wahi = aed ie o(tu)] +s) 


as s > 0, therefore 


af - 13; (-nVc,| 


p> (-1(") ) vc, 


rank 


and 


k+l /p, 4 k+l /p. 1\ = 
oh = Om S25 (FF 1) Sees _ ons’ (FF 1) 5 (1) Ses 
> 


J pr Pp” i=0 
q y a —p)'C, > ae i ee 
&, §—?) Cr 2 Sail — ol 
l-—q dX (—p)'C, 
528 or (ve 


jk a ts = 


1 “a > (—p)’C, 


2 > (-1)* (; )we, + (ip) 


i= $(tu)[1 — o(¢u)]\ 
l-—q : (—p)'C, 





es 
6. The determination of $D_r(s)$. In this section we shall suppose more generally than formerly that each particle independently of the others on its arrival gives rise to an impulse with probability $p_r$ if r impulses are present. Write $q_r = 1 - p_r$. The process of type [F(x), p] corresponds to the particular case when $p_0 = 1$ and $p_r = p$, r = 1, 2, ....

As before denote by $G_k(x)$, k = 0, 1, 2, ..., the distribution function of the distance between two consecutive transitions $E_{k-1} \to E_k$ and $E_k \to E_{k+1}$. (We say that a transition $E_{-1} \to E_0$ takes place at time t = 0.) Define

$$\gamma_r(s) = \int_0^{\infty} e^{-sx}\,dG_r(x) = \frac{D_r(s)}{D_{r+1}(s)} \tag{39}$$

where $D_0(s) = 1$. Thus we must determine $D_r(s)$, r = 1, 2, .... We note that if we write $D_r(s)$ in the following form

$$D_r(s) = \sum_{j=0}^{r} \binom{r}{j} \Delta^j D_0(s) \tag{40}$$

where $\Delta^j D_0(s)$ is the jth difference of $D_r(s)$ at r = 0, i.e.,

$$\Delta^j D_0(s) = \sum_{i=0}^{j} (-1)^{j-i} \binom{j}{i} D_i(s), \tag{41}$$

then $D_r(s)$ is uniquely determined by its differences.
Now we shall prove 
THEOREM 3. Starting from Do(s) = A°Do(s) = 1, the functions D,(s), r = 


0, 1, 2, --- , and the differences A’Do(s), 7 = 0, 1, 2, «++ , can be obtained succes- 
sively by the recurrence formulas 


 (-1)"% (‘) D;(s) 


j=0 


(42) ‘ 
= o(s + ju) 2 (—1)"? (") (p; Djas(s) + gq; Dj(s)] 


and 
i o(s+ju) ¥ (?) it 
‘ = .pe;4" D 
(43) A’Dol 8) = 5 Geet jay De NG) HM” Dole) 
respectively. Here 


(44) m= ¥ (-1)" Pa Pi-- 


v==() 


PROOF. By the theorem of total probability we can write that

$$G_r(x) = \sum_{j=0}^{r} \int_0^x \binom{r}{j} e^{-j\mu y} (1 - e^{-\mu y})^{r-j} \bigl[ p_j\, G_{j+1}(x-y) * \cdots * G_r(x-y) + q_j\, G_j(x-y) * \cdots * G_r(x-y) \bigr]\,dF(y), \tag{45}$$

if r = 0, 1, 2, ..., where the empty convolution is taken to be 1. To prove (45) let us consider the instant of a transition $E_{r-1} \to E_r$, and measure time from this instant. Then $G_r(x)$ is the probability that the next transition $E_r \to E_{r+1}$ occurs in the time interval (0, x]. This event may occur in the following mutually exclusive ways: the first particle in the time interval (0, x] arrives at the instant y (0 < y ≤ x) and it finds state $E_j$, j = 0, 1, ..., r, the probability of which is

$$\binom{r}{j} e^{-j\mu y} (1 - e^{-\mu y})^{r-j};$$

further in the time interval (y, x] a transition $E_r \to E_{r+1}$ occurs, the probability of which is

$$p_j\, G_{j+1}(x-y) * \cdots * G_r(x-y) + q_j\, G_j(x-y) * \cdots * G_r(x-y).$$

Introduce the notation

$$a_{r,j}(s) = \binom{r}{j} \int_0^{\infty} e^{-sx} e^{-j\mu x} (1 - e^{-\mu x})^{r-j}\,dF(x) \tag{46}$$

and form the Laplace-Stieltjes transform of (45); then

$$\gamma_r(s) = \sum_{j=0}^{r} a_{r,j}(s) \left[ p_j \prod_{i=j+1}^{r} \gamma_i(s) + q_j \prod_{i=j}^{r} \gamma_i(s) \right] \tag{47}$$

(r = 0, 1, 2, ...) where the empty product is 1. Now using (39) we find

$$D_r(s) = \sum_{j=0}^{r} a_{r,j}(s)\,[p_j D_{j+1}(s) + q_j D_j(s)], \tag{48}$$

r = 0, 1, 2, .... This is already a recurrence formula for the determination of $D_r(s)$, r = 0, 1, 2, ..., but the coefficients can be simplified further.

If we form

$$\Delta^j D_0(s) = \sum_{l=0}^{j} (-1)^{j-l} \binom{j}{l} D_l(s) \tag{49}$$

where $D_l(s)$ is replaced by (48) and take into consideration that

$$\sum_{l=i}^{j} (-1)^{j-l} \binom{j}{l} a_{l,i}(s) = (-1)^{j-i} \binom{j}{i} \varphi(s + j\mu), \tag{50}$$

then we obtain

$$\Delta^j D_0(s) = \varphi(s + j\mu) \sum_{i=0}^{j} (-1)^{j-i} \binom{j}{i} [p_i D_{i+1}(s) + q_i D_i(s)]. \tag{51}$$

Now comparing (49) and (51) we obtain (42).
On the other hand by (51) it follows 


A’Do(s) = o(8 + ju)A’Do(s) + o(s + ju) D> (—1)** (:) p; AD;(s) 


t=0 
whence 


A’D,(s) = o(s + ju) x A’ [po AD,(s)], 


1 — o(s + ju) 
A’ [p, ADo(s)] = © (’) oj: AD, (8) 


t= 


ji ne v j-t 
ci =A ps = D (—1) ~ Pi - 


v=n() 
This proves (43). 
Tue Proor or (35). In the case of a process of type [F(x), p] we have po = 1 


and p, = p,r = 1, 2, --- . In this particular case (43) reduces to the following 
difference equation 


52) aps) — 1 — ols + Ju) as 19) — As) 
(52) A™D)(s) - YE a’Di(s) + (-1)' De 


= 0, 1, 2, --- . A simple calculation shows that the solution of the difference 
ian (52) is 


ine. 5. 11 — (8 + tn) 
on (5 Cem) ) 


_ gil — o(s)) 4 > ru YT (t — (s+ »)) 


pols) i=1 imt+1 po(s + ip) 


(53) 


(Cf., Ch. Jordan [6]) and finally 
(54) D,(s) = 0 (‘) A’D)(s) 
j=0 


which completes the proof of (35). 
REMARK 3. If specifically we consider the process of type [F(x), 1] when $p_r = 1$, r = 0, 1, 2, ..., then (35) has the following simple form

$$\Delta^{j+1} D_0(s) = \frac{1 - \varphi(s + j\mu)}{\varphi(s + j\mu)}\, \Delta^{j} D_0(s), \qquad j = 0, 1, 2, \ldots,$$

whence

$$\Delta^{j} D_0(s) = \prod_{i=0}^{j-1} \frac{1 - \varphi(s + i\mu)}{\varphi(s + i\mu)}$$

and

$$D_r(s) = \sum_{j=0}^{r} \binom{r}{j} \prod_{i=0}^{j-1} \frac{1 - \varphi(s + i\mu)}{\varphi(s + i\mu)} \tag{55}$$

in agreement with our previous result [14].
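
The closed form for type [F(x), 1] can be compared with the recurrence (48) numerically. The sketch below assumes the readings $D_r(s) = \sum_j \binom{r}{j} \prod_{i<j} (1-\varphi(s+i\mu))/\varphi(s+i\mu)$ for (55) and $D_r(s) = \sum_j a_{r,j}(s) D_{j+1}(s)$ for (48) with every $p_j = 1$; the coefficients $a_{r,j}(s)$ of (46) are obtained by expanding $(1-e^{-\mu x})^{r-j}$ binomially. Poisson input and all function names are illustrative assumptions.

```python
from math import comb


def D_closed(r, s, mu, phi):
    """Reading of (55): sum_j C(r, j) prod_{i<j} (1 - phi(s+i*mu)) / phi(s+i*mu)."""
    total, prod = 0.0, 1.0
    for j in range(r + 1):
        total += comb(r, j) * prod
        f = phi(s + j * mu)
        prod *= (1.0 - f) / f
    return total


def a_coeff(r, j, s, mu, phi):
    """Reading of (46), after binomial expansion of (1 - e^{-mu x})^{r-j}."""
    return comb(r, j) * sum((-1) ** m * comb(r - j, m) * phi(s + (j + m) * mu)
                            for m in range(r - j + 1))


def recurrence_rhs(r, s, mu, phi):
    """Right-hand side of (48) when every p_j equals 1."""
    return sum(a_coeff(r, j, s, mu, phi) * D_closed(j + 1, s, mu, phi)
               for j in range(r + 1))


if __name__ == "__main__":
    phi = lambda s: 2.0 / (2.0 + s)      # Poisson input with rate 2, illustrative
    for r in range(5):
        print(r, D_closed(r, 0.4, 1.5, phi), recurrence_rhs(r, 0.4, 1.5, phi))
```

The two printed columns agree, and the ratio $D_0(s)/D_1(s)$ reduces to φ(s), as it must since $G_0(x) = F(x)$.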


7. The determination of y,(s). First we shall prove the following 
TueoreM 4. Jf M,(t) denotes the expectation of the number of transitions E, — 
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Ex4, occurring in the time interval (0, t| at the process of type [F (x), p| then we have 


S —_ (s+i 
: 26-1) (;) 0 (Eee ) 
(56) m(s) = [ e“amy(t) = = oa Se ONE 
0 ae ip) 
: E(- (2 — (8s + ip) 

PROOF. Evidently the difference of the number of transitions $E_k \to E_{k+1}$ and $E_{k+1} \to E_k$ occurring in the time interval (0, t] is 0 or 1 according to whether at the instant t the system is in one of the states $E_0, E_1, \ldots, E_k$ or in one of the states $E_{k+1}, E_{k+2}, \ldots$ respectively. Accordingly if we denote by $N_{k+1}(t)$ the expectation of the number of transitions $E_{k+1} \to E_k$ occurring in the time interval (0, t] then we have

(57)  $M_k(t) - N_{k+1}(t) = \sum_{j=k+1}^{\infty} P_j(t)$.

On the other hand

$N_{k+1}(t) = (k+1)\mu \int_0^t P_{k+1}(u)\,du$.

For, if we consider the process $\{\eta(t)\}$ only at those instants when there is a state $E_{k+1}$ then the transitions $E_{k+1} \to E_k$ form a Poisson process with density $(k+1)\mu$. Hence

(58)  $M_k(t) = (k+1)\mu \int_0^t P_{k+1}(u)\,du + \sum_{j=k+1}^{\infty} P_j(t)$.

Forming the Laplace-Stieltjes transform of (58) we obtain

(59)  $\mu_k(s) = (k+1)\mu\,\pi_{k+1}(s) + s \sum_{j=k+1}^{\infty} \pi_j(s)$.
Similarly if we consider the process of type [F(x), 1] then we have 
we 
(60) f(s) = (k+ 1)ptins(s) +8 D> #;(s) 
j=k+1 
Now comparing (59) and (60) and using the relation (21) we get 
oa (s) , 
(61) wa(a) = M2) 4 .(s) k =0,1,2,---. 
Ho(s s) 
In [14] we have shown that 
(62) i.(s) = >> (—1)™* (.)I (eet), k = 0,1,2, +>: 
, > ) IT 1 — o(s + ip) ; 
and we have seen earlier that 


© Bf kewl (MSE8T" 


r=0 ixi \l — os — i) 
Thus (61), (62) and (63) prove (56). 
REMARK 4. By a well known Tauberian theorem it follows that

(64)  $\lim_{t \to \infty} \frac{M_k(t)}{t} = \lim_{s \to 0} s\,\mu_k(s)$


and thus by (56) we obtain 


M,(t) Pp » (—1)"* (;) pc, 
(65) lim ; ee . —. 
“8 a -qd (—pY'¢ | 


r= 





This result can also be obtained from earlier results of this paper. Thus by (58) we obtain

(66)  $\lim_{t \to \infty} \frac{M_k(t)}{t} = \lim_{t \to \infty} \frac{(k+1)\mu}{t} \int_0^t P_{k+1}(u)\,du = (k+1)\mu P^*_{k+1}$

where $P^*_k$, k = 1, 2, ..., is defined by (28). Further we can conclude by renewal theory that

(67)  $\lim_{t \to \infty} \frac{M_k(t)}{t} = \frac{1}{\rho_k}$

where $\rho_k$ is defined by (37). For, the time differences between consecutive transitions $E_k \to E_{k+1}$ are identically distributed, independent random variables with expectation $\rho_k$.

THE PROOF OF (34). By using renewal theory we have

(68)  $M_k(t) = G_0(t) * G_1(t) * \cdots * G_k(t) * [I(t) + R_k(t) + R_k(t) * R_k(t) + \cdots]$

where I(t) = 1 if t ≥ 0 and I(t) = 0 if t < 0. Forming the Laplace transform of (68) we obtain


ay vo(s)yi(s) «++ ve(8) a 1 
(69) ue($) = 1 — va) Desa(s){1 — ve(s)] 





whence

(70)  $\gamma_k(s) = 1 - [D_{k+1}(s)\,\mu_k(s)]^{-1}$

where $\mu_k(s)$ is defined by (56) and $D_{k+1}(s)$ by (35). This proves (34).

8. The limiting distribution of $\eta(t)$. We shall prove

THEOREM 5. If $\alpha < \infty$ and F(x) is not a lattice distribution then the limiting distribution $\lim_{t \to \infty} P_k(t) = P^*_k$, k = 0, 1, 2, ..., exists and is defined by (27) and (28).

PROOF. In Remark 2 we showed that if $\alpha < \infty$ then

(71)  $\lim_{t \to \infty} \frac{1}{t} \int_0^t P_k(u)\,du = P^*_k$.

Now if we show that $\lim_{t \to \infty} P_k(t)$ exists then by (71) we get that $\lim_{t \to \infty} P_k(t) = P^*_k$. To prove the existence we need the following auxiliary theorem: If F(x) is not a lattice distribution then

(72)  $\lim_{t \to \infty} \frac{M_k(t+h) - M_k(t)}{h}$

exists for every h > 0 and is independent of h. This statement follows from a theorem of D. Blackwell [2], since if F(x) is not a lattice distribution then the distribution of the distance between successive transitions $E_k \to E_{k+1}$ is also a non-lattice distribution. If (72) exists then it clearly agrees with (66), i.e., if $\alpha < \infty$ and F(x) is not a lattice distribution then for every h > 0

(73)  $\lim_{t \to \infty} \frac{M_k(t+h) - M_k(t)}{h} = (k+1)\mu P^*_{k+1}$,  k = 0, 1, 2, ...,

where $P^*_k$, k = 1, 2, ..., is defined by (28).
Now by the theorem of total probability we can write that

(74)  $P_k(t) = \sum_{j=k}^{\infty} \int_0^t \binom{j}{k} e^{-k\mu(t-u)} (1 - e^{-\mu(t-u)})^{j-k} [1 - F(t-u)]\,dM_{j-1}(u)$,

k = 1, 2, ..., where the distribution function F(x) is defined by (2). To prove (74) let us note that the event that the system is in state $E_k$ at the instant t can occur in several mutually exclusive ways: the last transition in the time interval (0, t] is $E_{j-1} \to E_j$, j = k, k + 1, ...; this is the nth (n = 1, 2, ...) among the transitions $E_{j-1} \to E_j$; this transition takes place at the instant u (0 < u ≤ t); and in the time interval (u, t] no new impulses are starting, but j − k impulses terminate.

The function

$e^{-k\mu x} (1 - e^{-\mu x})^{j-k} [1 - F(x)]$

is of bounded variation in the interval (0, ∞) and so it follows from (73) that the limit of (74) exists and we have

(75)  $\lim_{t \to \infty} P_k(t) = \mu \sum_{j=k}^{\infty} j P^*_j \binom{j}{k} \int_0^{\infty} e^{-k\mu x} (1 - e^{-\mu x})^{j-k} [1 - F(x)]\,dx$,

k = 1, 2, ... . The limit may be formed term by term, the series being uniformly convergent. Finally, $\lim_{t \to \infty} P_0(t)$ also exists, because

$P_0(t) = 1 - \sum_{k=1}^{\infty} P_k(t)$.

This completes the proof of the theorem.


9. The limiting distribution of $\eta_n$. We shall prove
THEOREM 6. The limiting distribution $\lim_{n \to \infty} P\{\eta_n = k\} = P_k$, k = 0, 1, 2, ..., always exists and

(76)  $P_k = \sum_{r=k}^{\infty} (-1)^{r-k} \binom{r}{k} B_r$

where $B_r$ is the rth binomial moment of $\{P_k\}$. We have $B_0 = 1$ and

(77)  $B_r = \frac{p^r C_r}{1 - q \sum_{j=0}^{\infty} (-1)^j p^j C_j}$,  r = 1, 2, ... .

Specifically

(78)  $P_0 = \frac{p \sum_{r=0}^{\infty} (-1)^r p^r C_r}{1 - q \sum_{r=0}^{\infty} (-1)^r p^r C_r}$.

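PROOF. Define

$\pi_{jk}(x) = p \binom{j+1}{k} e^{-k\mu x} (1 - e^{-\mu x})^{j+1-k} + q \binom{j}{k} e^{-k\mu x} (1 - e^{-\mu x})^{j-k}$

if j = 1, 2, 3, ..., and

$\pi_{00}(x) = 1 - e^{-\mu x}$,  $\pi_{01}(x) = e^{-\mu x}$,  $\pi_{0k}(x) = 0$ if k > 1.

It is easy to see that the sequence of random variables $\{\eta_n\}$, n = 1, 2, ..., forms a Markov chain with transition probabilities $P\{\eta_{n+1} = k \mid \eta_n = j\} = p_{jk}$ where

(79)  $p_{jk} = \int_0^{\infty} \pi_{jk}(x)\,dF(x)$.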


The Markov chain $\{\eta_n\}$ is evidently irreducible and aperiodic. By a theorem of F. G. Foster [5] we can prove that the states are also ergodic. Consequently the limiting distribution $\lim_{n \to \infty} P\{\eta_n = k\} = P_k$, k = 0, 1, 2, ..., exists and is independent of the initial distribution. The limiting distribution $\{P_k\}$ is uniquely determined by the following system of linear equations

(80)  $P_k = \sum_{j=k-1}^{\infty} p_{jk} P_j$,

and

(81)  $\sum_{k=0}^{\infty} P_k = 1$

(cf. W. Feller [3]). In (80) $P_{-1} = 0$.
To solve this system of linear equations let us introduce the generating function

(82)  $U(z) = \sum_{k=0}^{\infty} P_k z^k$.


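By (80) we obtain

(83)  $U(z) = p \int_0^{\infty} (1 - e^{-\mu x} + z e^{-\mu x})\, U(1 - e^{-\mu x} + z e^{-\mu x})\,dF(x) - q P_0 (1 - z) \varphi_1 + q \int_0^{\infty} U(1 - e^{-\mu x} + z e^{-\mu x})\,dF(x)$.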


Now let us introduce the binomial moments

(84)  $B_r = \sum_{k=r}^{\infty} \binom{k}{r} P_k$

of the distribution $\{P_k\}$. If $B_r$ exists, then by (82) we have

(85)  $B_r = \frac{1}{r!} \left( \frac{d^r U(z)}{dz^r} \right)_{z=1}$.

By (81), $B_0 = 1$. Forming the rth derivative of (83) at z = 1 we obtain

$B_1 = p \varphi_1 (B_1 + B_0) + q P_0 \varphi_1 + q B_1 \varphi_1$

if r = 1 and

(86)  $B_r = p \varphi_r (B_r + B_{r-1}) + q \varphi_r B_r$

if r = 2, 3, ... . Hence

(87)  $B_r = p^r C_r (1 + (q P_0 / p))$,

where $P_0$ is still to be determined. The probability distribution $\{P_k\}$ is uniquely determined by its binomial moments, namely by (82) and (85)

(88)  $P_k = \frac{1}{k!} \left( \frac{d^k U(z)}{dz^k} \right)_{z=0} = \sum_{r=k}^{\infty} (-1)^{r-k} \binom{r}{k} B_r$.

Since

(89)  $P_0 = \sum_{r=0}^{\infty} (-1)^r B_r = 1 + \left(1 + \frac{q P_0}{p}\right) \sum_{r=1}^{\infty} (-p)^r C_r$,

consequently

(90)  $P_0 = \frac{p \sum_{r=0}^{\infty} (-p)^r C_r}{1 - q \sum_{r=0}^{\infty} (-p)^r C_r}$

and by (87)

(91)  $B_r = \frac{p^r C_r}{1 - q \sum_{j=0}^{\infty} (-p)^j C_j}$.



The theorem is proved by (88) and (91). 
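The closed form for $P_0$ can be checked numerically against the Markov chain itself. The Python sketch below does this for the special case in which every interarrival time equals a constant `gap`, so that $p_{jk} = \pi_{jk}(\text{gap})$ and $\varphi(s) = e^{-s \cdot \text{gap}}$. It rests on two assumptions that are not stated in this section: that $C_0 = 1$ and $C_r = \prod_{i=1}^{r} \varphi(i\mu)/(1 - \varphi(i\mu))$ (the form implied by the recurrence (86)), and that truncating the state space at `size` states is harmless. All function names are illustrative.

```python
from math import comb, exp

def transition_matrix(p, mu, gap, size):
    """Rows j -> k of the embedded chain when every interarrival time equals gap."""
    surv = exp(-mu * gap)                  # chance that one call outlives the gap
    def thin(n, k):                        # probability of k survivors out of n calls
        return comb(n, k) * surv**k * (1.0 - surv)**(n - k) if k <= n else 0.0
    rows = []
    for j in range(size):
        accept = 1.0 if j == 0 else p      # p_0 = 1 and p_r = p for r >= 1
        if j == size - 1:
            accept = 0.0                   # truncation device only, not part of the model
        rows.append([accept * thin(j + 1, k) + (1.0 - accept) * thin(j, k)
                     for k in range(size)])
    return rows

def stationary(rows, sweeps=300):
    """Approximate the limiting distribution by repeated multiplication v <- v P."""
    n = len(rows)
    v = [1.0] + [0.0] * (n - 1)
    for _ in range(sweeps):
        v = [sum(v[j] * rows[j][k] for j in range(n)) for k in range(n)]
    return v

def p0_closed_form(p, mu, gap, terms=60):
    """Formula (90), with C_r built up as an assumed product (see the lead-in)."""
    q = 1.0 - p
    total, c_r = 0.0, 1.0                  # c_r starts at the assumed C_0 = 1
    for r in range(terms):
        total += (-p)**r * c_r
        phi = exp(-(r + 1) * mu * gap)     # phi((r + 1) mu) for a constant gap
        c_r *= phi / (1.0 - phi)
    return p * total / (1.0 - q * total)

chain = stationary(transition_matrix(0.6, 1.0, 0.5, 25))
print(chain[0], p0_closed_form(0.6, 1.0, 0.5))
```

The two printed values should agree closely when the truncation level is well above the typical number of busy lines; for heavier traffic a larger `size` would be needed.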
REMARK 5. By (67) and the theory of Markov chains we can conclude

(92)  $\lim_{t \to \infty} \frac{M_k(t)}{t} = P_0/\alpha$ if k = 0,  $= p P_k/\alpha$ if k = 1, 2, ... .

Further we have seen earlier that

(93)  $\lim_{t \to \infty} \frac{N_{k+1}(t)}{t} = (k+1)\mu P^*_{k+1}$,  k = 0, 1, 2, ...,

where $P^*_k$ is defined by (28). Obviously $0 \le M_k(t) - N_{k+1}(t) \le 1$ for every t ≥ 0 and thus (92) and (93) agree. Accordingly a simple relationship exists between the distributions $\{P_k\}$ and $\{P^*_k\}$, namely

$P^*_1 = \frac{P_0}{\alpha\mu}$,

$P^*_k = \frac{p P_{k-1}}{k \alpha \mu}$ if k = 2, 3, ...,

$P^*_0 = 1 - \sum_{k=1}^{\infty} P^*_k$.
ON THE RUIN PROBLEM OF COLLECTIVE RISK THEORY


By N. U. PRABHU
University of Western Australia 


0. Summary. The theory of collective risk deals with an insurance business, for which, during a time interval (0, t) (1) the total claim X(t) has a compound Poisson distribution, and (2) the gross risk premium received is λt. The risk reserve Z(t) = u + λt − X(t), with the initial value Z(0) = u, is a temporally homogeneous Markov process. Starting with the initial value u, let T be the first subsequent time at which the risk reserve becomes negative, i.e., the business is “ruined”. The problem of ruin in collective risk theory is concerned with the distribution of the random variable T; this distribution has not so far been obtained explicitly except in a few particular cases. In this paper, the whole problem is re-examined, and explicit results are obtained in the cases of negative and positive processes. These results are then extended to the case where the total claim X(t) is a general additive process.


1. Introduction. The theory of collective risk, as developed by the Swedish 
actuary Filip Lundberg, deals with the business of an insurance company. 
Following a series of papers published by him during the years 1909-1934, a 
considerable amount of work has been done by Cramér, Segerdahl, Tacklind, 
Saxén, Arfwedson and many others; a survey of the theory from the point of 
view of stochastic processes was given by Cramér [2], [3] and an excellent review 
has recently been given by Arfwedson [1]. Briefly, the mathematical model used 
in this theory can be described as follows. 

(a) The claims occur entirely “at random”, that is, during the infinitesimal interval of time (t, t + dt), the probability of a claim occurring is dt and the probability of more than one claim occurring is of a smaller order than dt, these probabilities being independent of the claims which have occurred during (0, t).

(b) If a claim does occur, the amount claimed is a random variable with the probability distribution dP(x) ($-\infty < x < \infty$), negative claims occurring in the case of ordinary whole-life annuities.

Under the assumptions (a) and (b), it is easily seen that the total amount X(t) of all claims which occur during (0, t) has the compound Poisson distribution given by

(1.1)  $K(x, t) = \Pr\{X(t) \le x\} = \sum_{n=0}^{\infty} e^{-t} \frac{t^n}{n!} P_n(x)$,

where $P_n(x)$ is the n-fold convolution of P(x) with itself, and $P_0(x) = 0$ if x < 0 and = 1 if x ≥ 0. The expected claim during (0, t) is given by tα, where

(1.2)  $\alpha = \int_{-\infty}^{\infty} x\,dP(x)$;

tα is called the net risk premium.

(c) During an interval of length t, the company receives an amount λt from the totality of its policyholders; λt is called the gross risk premium. The difference, λ − α, is called the “safety loading”, which is in practice positive. However, we shall not assume this, but only that λ and α are of the same sign. The ratio $\rho = \lambda\alpha^{-1}$ (> 0) is called Lundberg’s security factor, and is of great importance in the theory of collective risk.

The function Z(t) = u + λt − X(t) is called the risk reserve, with the initial value Z(0) = u. Clearly, Z(t) is a temporally homogeneous Markov process with the transition distribution function

(1.3)  $P(u; z, t) = \Pr\{Z(t) \le z \mid Z(0) = u\} = 1 - K(\lambda t + u - z, t)$.


Starting with the initial value u, let T be the first subsequent time at which the risk reserve becomes negative, i.e., the company is “ruined”. We shall call T the “period of prosperity”. The ruin problem of collective risk theory is concerned with the distribution of the random variable T. Let us denote by

(1.4)  $G(t, u) = \Pr\{T \le t\}$  (0 ≤ t < ∞)

the cumulative distribution function (c.d.f.) of T. The function

$F(t, u) = 1 - G(t, u)$

gives the probability that ruin occurs only after time t, i.e., the risk reserve Z(t) remains non-negative throughout the interval (0, t). By considering F(t, u) over the consecutive intervals (0, dt) and (dt, dt + t) we obtain the relation

(1.5)  $F(t + dt, u) = (1 - dt) F(t, u + \lambda\,dt) + dt \int_{-\infty}^{u + \lambda dt} F(t, u + \lambda\,dt - x)\,dP(x) + o(dt)$

which, on simplification, yields the integro-differential equation

(1.6)  $\frac{\partial F}{\partial t} - \lambda \frac{\partial F}{\partial u} + F(t, u) = \int_{-\infty}^{u} F(t, u - x)\,dP(x)$

with the initial condition F(0, u) = 1 for u ≥ 0. This result is due to Arfwedson [1]. Taking the Fourier transforms (F.T.’s) of both sides of (1.6) with respect to t, we obtain an integro-differential equation for the F.T. of F(t, u), whose solution, on inversion, gives the function F(t, u). However, this procedure has not led to explicit expressions for F(t, u) except in some particular cases.

The expression

(1.7)  $\psi(u) = G(\infty, u) = \Pr\{T < \infty\}$

gives the probability that a company with initial capital u will eventually be ruined. Proceeding as in (1.5) we obtain the integro-differential equation

(1.8)  $-\lambda \psi'(u) + \psi(u) = 1 - P(u) + \int_{-\infty}^{u} \psi(u - x)\,dP(x)$,

which is essentially the same as the integral equation obtained by Cramér [1]. Further, let us set F(u) = F(∞, u) = 1 − ψ(u); F(u) gives the probability that the ruin never occurs.


2. The probability of ruin for a negative process. Let us first consider the case of an insurance company which deals only in ordinary whole-life annuities. Here all the claims are negative, and the process is sometimes referred to as a “negative process”. If we put $\bar{X}(t) = -X(t)$, and $B(x) = 1 - P(-x)$, 0 ≤ x < ∞, then the distribution of $\bar{X}(t)$ is

(2.1)  $K(x, t) = \sum_{n=0}^{\infty} e^{-t} \frac{t^n}{n!} B_n(x)$  (0 ≤ x < ∞),

its Laplace transform (L.T.) being given by

(2.2)  $\int_0^{\infty} e^{-\theta x}\,d_x K(x, t) = \exp\{-t[1 - \varphi(\theta)]\}$,

where $\varphi(\theta)$ is the L.T. of dB(x). Here α < 0, and, since λ and α are of the same sign, λ < 0; without loss of generality we can take λ = −1, so that the risk reserve in this case becomes $Z(t) = u - t + \bar{X}(t)$.

Now consider the queueing system M/G/1, in which (a) the inter-arrival times of the customers have the negative exponential distribution $e^{-t}\,dt$ (0 < t < ∞); (b) the queue discipline is “first come, first served”; and (c) there is only one counter, and the service time has the distribution dB(t) (0 < t < ∞). It is easily seen that the total service time of customers joining this system during the time interval (0, t) has the distribution (2.1), and that this total service time is steadily exhausted by the server at a unit rate except when the counter is free (see Prabhu [6]). The busy period of the server initiated by a waiting time u is thus seen to be analogous to the period of prosperity in collective risk theory, the arrival of a customer corresponding to the occurrence of a claim, and the service time of a customer corresponding to the amount claimed. Using this analogy and the results obtained by Prabhu [7] we see that the joint probability distribution of the length of the period of prosperity T and the number of claims settled during this period is given by

(2.3)  $G_n(t, u) = \int_u^t e^{-\tau} \frac{u \tau^{n-1}}{n!}\,dB_n(\tau - u)$,  n = 0, 1, 2, ... .

Hence we obtain the expression

(2.4)  $G(t, u) = \int_u^t \sum_{n=0}^{\infty} e^{-\tau} \frac{u \tau^{n-1}}{n!}\,dB_n(\tau - u)$

for the distribution of T, a result which has not been explicitly obtained before. Further, we obtain

(2.5)  $\int_0^{\infty} e^{-\theta t}\,d_t G(t, u) = e^{-u\eta(\theta)}$

as the L.T. of this distribution, where $\eta(\theta)$ satisfies the functional equation

(2.6)  $\eta(\theta) = \theta + 1 - \varphi\{\eta(\theta)\}$.

(See Takács [8].) From (2.5) it follows that the probability of the eventual ruin of the company is given by

(2.7)  $\psi(u) = G(\infty, u) = 1$ if ρ ≥ 1,  $= e^{-uR}$ if ρ < 1,

a result due to Lundberg; here $\rho = |\alpha|^{-1}$ is Lundberg’s security factor, and R is the largest positive root of the equation

(2.8)  $R = 1 - \varphi(R)$.
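Equation (2.8) rarely has a closed-form solution, but its largest root is easy to approximate. A minimal sketch, assuming that the transform of dB is available as a Python function: starting from R = 1 and iterating R ← 1 − φ(R) gives a decreasing sequence that settles on the largest root in [0, 1]. The function names are illustrative, and the exponential and constant-size cases are chosen only because the answers can be checked.

```python
from math import exp

def largest_root(transform, iterations=500):
    """Iterate R <- 1 - transform(R) from R = 1 towards the largest root of (2.8)."""
    r = 1.0
    for _ in range(iterations):
        r = 1.0 - transform(r)
    return r

def ruin_probability(u, transform):
    """psi(u) = exp(-u R) as in (2.7); R = 0 corresponds to certain ruin."""
    return exp(-u * largest_root(transform))

# Exponential amounts with rate mu = 0.5 (mean 2): (2.8) gives R = 1 - mu exactly.
r_exponential = largest_root(lambda theta: 0.5 / (0.5 + theta))

# Amounts of constant size 2: R solves R = 1 - exp(-2 R).
r_constant = largest_root(lambda theta: exp(-2.0 * theta))

print(r_exponential, r_constant, ruin_probability(1.0, lambda theta: exp(-2.0 * theta)))
```

When ρ ≥ 1 the only root in [0, 1] is R = 0; the iteration then drifts to zero, slowly if ρ is exactly 1, which matches the statement that ruin is certain in that case.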


EXAMPLES.
(a) Let B(x) = 0 if x < μ, and = 1 if x ≥ μ. Then $B_n(x) = 0$ if x < nμ, and = 1 if x ≥ nμ, and (2.4) gives

(2.9)  $G(t, u) = \sum_{n=0}^{\infty} \int_u^t e^{-\tau} \frac{u \tau^{n-1}}{n!}\,dB_n(\tau - u) = \sum_{n=0}^{N} e^{-(u + n\mu)} \frac{u (u + n\mu)^{n-1}}{n!}$

where N = [(t − u)/μ] is the largest integer contained in (t − u)/μ.
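Formula (2.9) is a finite sum and can be evaluated directly. The sketch below (function names are illustrative) computes it with logarithms to avoid overflow, and compares its value for large t with the limit $e^{-uR}$ of (2.7), where R is obtained from $R = 1 - e^{-\mu R}$ by simple iteration; u > 0 is assumed.

```python
from math import exp, floor, lgamma, log

def largest_root(mu, iterations=500):
    """Largest root in [0, 1] of R = 1 - exp(-mu R), found by iteration from R = 1."""
    r = 1.0
    for _ in range(iterations):
        r = 1.0 - exp(-mu * r)
    return r

def ruin_time_cdf(t, u, mu):
    """G(t, u) of (2.9) for amounts of constant size mu; assumes u > 0."""
    if t < u:
        return 0.0                         # ruin cannot occur before time u
    total = 0.0
    for n in range(floor((t - u) / mu) + 1):
        x = u + n * mu                     # only possible ruin time with n claims
        total += exp(log(u) + (n - 1) * log(x) - x - lgamma(n + 1))
    return total

print(ruin_time_cdf(400.0, 2.0, 0.5))                         # mu <= 1: tends to 1
print(ruin_time_cdf(600.0, 1.0, 2.0), exp(-largest_root(2.0)))  # mu > 1: tends to exp(-u R)
```

The n = 0 term equals $e^{-u}$, the probability that no annuity payment falls due before the reserve is exhausted.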
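(b) Let $B(x) = 1 - e^{-\mu x}$ (0 ≤ x < ∞). In this case

$dB_n(x) = e^{-\mu x} \frac{(\mu x)^{n-1}}{(n-1)!}\,\mu\,dx$,

and

(2.10)  $G(t, u) = e^{-u} + \sum_{n=1}^{\infty} \int_u^t e^{-\tau} \frac{u \tau^{n-1}}{n!}\, e^{-\mu(\tau - u)} \mu^n \frac{(\tau - u)^{n-1}}{(n-1)!}\,d\tau = e^{-u} + \mu u e^{\mu u} \int_u^t e^{-(1+\mu)\tau} J'\{\mu(\tau - u)\tau\}\,d\tau$,

where

(2.11)  $J(x) = \sum_{n=0}^{\infty} \frac{x^n}{(n!)^2}$

is a Bessel function (see Arfwedson [1], part II, equation 152).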


3. The case of a positive process. We next consider the case where all claims are positive (a “positive process”). Let us change slightly the notation used in Section 1, and denote the distribution of X(t) by (2.1). Here α > 0, and we may take λ = 1. The risk reserve is then given by Z(t) = u + t − X(t), and the required probability is

(3.1)  $F(t, u) = \Pr\{u + \tau - X(\tau) \ge 0 \ (0 \le \tau \le t)\}$.

Now consider the transition of the process Z(τ) from the initial value Z(0) = u to a non-negative value Z(t) ≥ 0, the probability of which is

(3.2)  $\Pr\{u + t - X(t) \ge 0\} = K(t + u, t)$.


Such a transition can occur in two mutually exclusive ways: (1) negative values are not assumed throughout the interval (0, t), and (2) negative values are assumed. In the latter case it is clear that a transition from a negative value to a positive value must occur at some point in (0, t); let the last such transition occur at time τ, so that during the remaining interval (τ, t) only non-negative values are assumed. These considerations lead to the relation

$K(t + u, t) = F(t, u) + \int_0^t F(t - \tau, 0)\,dK(\tau + u, \tau)$

or

(3.3)  $F(t, u) = K(t + u, t) - \int_0^t F(t - \tau, 0)\,dK(\tau + u, \tau)$.

Here we have set

(3.4)  $dK(t + u, t) = [d_x K(x, t)]_{x = t + u}$.


To obtain F(t, 0) which appears on the right side of (3.3), let us consider the transition of the process Z(τ) from the initial value Z(0) = 0 to the value Z(t) = x > 0, such that Z(τ) > 0 (0 < τ ≤ t). Let F(t, 0; x) dt denote the probability of such a transition. It is obvious that for every such transition of the process Z(τ) there corresponds a transition of the process $\bar{Z}(\tau) = Z(t - \tau)$ from the initial value $\bar{Z}(0) = x$ to the value $\bar{Z}(t) = 0$ such that $\bar{Z}(\tau) > 0$ (0 < τ < t). However, $\bar{Z}(\tau) = x + X(\tau) - \tau$ is the process which has already been studied in Section 2, so that

(3.5)  $F(t, 0; x)\,dt = \sum_{n=0}^{\infty} e^{-t} \frac{x t^{n-1}}{n!}\,dB_n(t - x)$.

Hence we obtain

(3.6)  $F(t, 0)\,dt = dt \int_0^t F(t, 0; x)\,dx = \sum_{n=0}^{\infty} e^{-t} \frac{t^{n-1}}{n!} \int_{x=0}^{t} x\,dB_n(t - x)\,dx$.

Further, using (2.5) we obtain the L.T. of F(t, 0) as

(3.7)  $\int_0^{\infty} e^{-\theta t} F(t, 0)\,dt = \int_0^{\infty} e^{-x\eta(\theta)}\,dx = \frac{1}{\eta(\theta)}$,

where $\eta(\theta)$ is given by (2.6) (cf. Arfwedson [1], part I, equation 44).
The integro-differential equation (1.6) reduces, in the case of the positive process, to

(3.8)  $\frac{\partial F}{\partial t} - \frac{\partial F}{\partial u} + F(t, u) = \int_0^{u} F(t, u - x)\,dB(x)$.
This equation is the same as the one satisfied by the c.d.f. of the waiting time in the queueing system M/G/1 (a result due to Takács [8]), and its solution (3.3) has been obtained by Prabhu [6]. In the present context, however, the solution has been obtained by straightforward arguments. Nevertheless it is interesting to see the connection between the two situations. Following Gani and Pyke [4] it can be proved that the waiting time W(t) of a customer who joins the system M/G/1 at time t is given by

(3.9)  $W(t) = \sup_{0 \le \tau \le t} [X(\tau) - \tau]$,

where we have taken W(0) = 0. Hence

$\Pr\{W(t) \le u\} = \Pr\{\sup_{0 \le \tau \le t} [X(\tau) - \tau] \le u\} = \Pr\{\inf_{0 \le \tau \le t} [u + \tau - X(\tau)] \ge 0\} = F(t, u)$.

From this it follows that F(u) = F(∞, u) is the limiting waiting time distribution, which is known to exist if α < 1; its L.T. is given by the Pollaczek-Khintchine formula

(3.11)  $\int_0^{\infty} e^{-\theta u}\,dF(u) = \frac{(1 - \alpha)\theta}{\theta - 1 + \varphi(\theta)}$  (α < 1, R(θ) > 0).

Letting θ → ∞ in the above formula we obtain

(3.12)  $1 - \psi(0) = F(0) = 1 - \alpha$;  $\psi(0) = \alpha$,

while inversion of (3.11) yields the result

(3.13)  $\psi(u) = (1 - \alpha) \int_{t=0}^{\infty} \sum_{n=0}^{\infty} e^{-t} \frac{t^n}{n!}\,dB_n(t + u)$

as obtained by Prabhu [6]. The explicit results (3.3), (3.6) and (3.13) have not been obtained before.
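EXAMPLES.
(a) Let B(x) = 0 if x < α and = 1 if x ≥ α. Then

(3.14)  $F(t, 0) = \sum_{n=0}^{N_0} e^{-t} \frac{t^{n-1}}{n!} (t - n\alpha) = e^{-t} \frac{t^{N_0}}{N_0!} + (1 - \alpha) \sum_{n=0}^{N_0 - 1} e^{-t} \frac{t^n}{n!}$,

and

(3.15)  $F(t, u) = \sum_{n=0}^{N_2} e^{-t} \frac{t^n}{n!} - \sum_{n=N_1+1}^{N_2} e^{-t} \frac{(n\alpha - u)^n}{n!} \left[ \frac{(t + u - n\alpha)^{N_2 - n}}{(N_2 - n)!} + (1 - \alpha) \sum_{\nu=0}^{N_2 - n - 1} \frac{(t + u - n\alpha)^{\nu}}{\nu!} \right]$,

where $N_0 = [t/\alpha]$, $N_1 = [u/\alpha]$, $N_2 = [(t + u)/\alpha]$, [x] being the largest integer contained in x (cf. Arfwedson [1], part II, equation 135). Further, if α < 1,

(3.16)  $\psi(u) = (1 - \alpha) \sum_{n=N_1+1}^{\infty} e^{-(n\alpha - u)} \frac{(n\alpha - u)^n}{n!}$.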
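The series (3.16) for claims of constant size α can be summed numerically and compared with two values known independently: ψ(0) = α from (3.12), and, for u < α, $\psi(u) = 1 - (1 - \alpha)e^{u}$, which is the classical M/D/1 waiting-time distribution on its first interval. The Python sketch below is illustrative only; the truncation at a fixed number of terms is an assumption that works well when α is not close to 1, since the terms decay roughly like $(\alpha e^{1-\alpha})^n$.

```python
from math import exp, floor, lgamma, log

def ruin_probability(u, alpha, terms=2000):
    """psi(u) of (3.16) for claims of constant size alpha < 1, truncated after `terms` terms."""
    first = floor(u / alpha) + 1           # N_1 + 1, the first index with n*alpha > u
    total = 0.0
    for n in range(first, first + terms):
        x = n * alpha - u                  # strictly positive by the choice of `first`
        total += exp(n * log(x) - x - lgamma(n + 1))
    return (1.0 - alpha) * total

print(ruin_probability(0.0, 0.5))          # should be close to alpha = 0.5
print(ruin_probability(0.25, 0.5), 1.0 - 0.5 * exp(0.25))
```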


(b) Let $B(x) = 1 - e^{-\mu x}$ (0 ≤ x < ∞). Here


pa ! von 901 Sent 
(3.17) iti hae 
om AF e” J( pry) dy. 
y=0 


1 — K(z,t) = > on — B,(x)] = e at > _ Sen. 


= 7c (ez)" = Pat 
van) v n=v+1 nN: 


We have 


n—l n—l 
t 


2 t 
—¢ —t pe ae —pr 7n x 
F(t,0) =e + de os (t—ax)e™p Som pi 


whence we obtain 


t 


(3.18) G(t,0) =e” [ e "J (ptr) dx + e‘p [ ex J'(ptx) dex. 
0 0 


The expression for G(t, u) can be simplified similarly (cf. Arfwedson [1], part II, page 85). Also, if ρ > 1,


(3.19) Wu) = (1 2 ‘) es Fe e417" (ot? + put) dt. 
p 0 


4. A generalized model. Recently, certain alternative specifications for the 
occurrence of claims have been made (see Arfwedson [1]) ; however, the reasoning 
used in deriving the various formulae of Sections 2 and 3 indicates that the total 
claim X(t) can be a general additive process with stationary increments (of 
which the compound Poisson distribution is a particular case). If then we try to 
write down the integro-differential equation of the type (1.6) for this generalized 
model, it will be found necessary to characterize the process X(t) over the 
infinitesimal interval (0, dt), which may present difficulties of a serious nature. 
However, such difficulties can be avoided, at least for processes which are purely 
negative or purely positive, by using the methods of the present paper. 

Consider first the case of a negative process. Its risk reserve is

$Z(t) = u + X(t) - t$,

where we now take X(t) to be an additive process with stationary increments, with the continuous frequency function k(x, t) (0 ≤ x < ∞, 0 ≤ t < ∞). It is known that the L.T. of X(t) is given by

(4.1)  $\int_0^{\infty} e^{-\theta x} k(x, t)\,dx = e^{-t\xi(\theta)}$,

where $\xi(\theta)$ is a function of a specified type. The risk reserve can then be compared to the content of a dam which is fed by inputs of water forming an additive process with stationary increments, and from which there is a steady release of water at a unit rate except when the dam is empty. The period of prosperity T of the insurance company is then analogous to the “wet period” of the dam. It follows from the results of Kendall [5] that the frequency function of T is given by

(4.2)  $g(t, u) = (u/t)\,k(t - u, t)$

and its L.T. by

(4.3)  $\int_0^{\infty} e^{-\theta t} g(t, u)\,dt = e^{-u\eta(\theta)}$,

where $\eta(\theta)$ satisfies the functional equation

(4.4)  $\eta(\theta) = \theta + \xi\{\eta(\theta)\}$  (θ > 0).

Further, from (4.3) and (4.4) it follows that the probability of the eventual ruin of the company is

(4.5)  $\psi(u) = 1$ if ρ ≥ 1,  $= e^{-u\zeta}$ if ρ < 1,

where ζ is the largest positive root of the equation $\zeta = \xi(\zeta)$ and ρ is Lundberg’s security factor.

Next, consider the case of the positive process, with the risk reserve Z(t) = u + t − X(t) where X(t) is as defined for the negative process. It is clear that the reasoning employed in obtaining F(t, 0) in Section 3 is valid for a general additive process X(t), so that we have

(4.6)  $F(t, 0) = \int_0^t \frac{x}{t}\,k(t - x, t)\,dx$.

The equation (3.3) remains valid, so that F(t, u) can be completely determined in the general case.

5. Acknowledgment. I am grateful to Dr. Arfwedson (Stockholm) for many 
helpful comments on a preliminary draft of this paper. 
THE RANDOM WALK BETWEEN A REFLECTING AND AN
ABSORBING BARRIER


By B. WEESAKUL
The University of Western Australia


1. Introduction. In this paper, the classical problem of random walk restricted between two barriers at 0 and b is discussed. A particle, starting from the initial position u on the x-axis (0 < u ≤ b an integer) at t = 0, moves one unit to the left or right of its position at times t = 1, 2, ... . The probabilities for the moves are respectively q and p (q + p = 1), the moves being independent. We assume that the barrier at 0 is absorbing and the one at b reflecting so that (i) when the particle reaches the barrier at 0, it is absorbed and the process terminates; (ii) when at any integral time τ (τ ≥ b − u), the particle is at the barrier at b, there is a probability p that it remains there at the next instant (τ + 1) and a probability q that it moves one unit to the left.

Random walk problems have been extensively studied (see Feller [1]), and 
their application to the theory of Brownian movement has been discussed by 
Kac [2] among others. With the assumption that there is one reflecting barrier 
at 0 and the other at ~, Kac was able to derive an explicit expression for 


P(n, m | s),


the probability that the particle starting from position n is at m after time s has 
elapsed. Other cases where both barriers are absorbing and where both barriers 
are reflecting have also been discussed by Feller [1]. We are concerned in this 
paper with the case where one barrier is absorbing and the other reflecting; we 
shall derive the expression for the generating function of the probabilities of 
absorption. 


2. Generating function for the probabilities of absorption. Let g(t | u) be the probability that the particle reaches the barrier at 0 for the first time (thus being absorbed) at time t starting from the initial position u at t = 0. The probability g(t | u) satisfies the difference equation:

(1)  $g(t \mid u) = g(t - 1 \mid u - 1)\,q + g(t - 1 \mid u + 1)\,p$,  (u = 1, 2, ..., b − 1; t = 1, 2, ...)

where g(0 | 0) = 1 and g(t | u) = 0 for t < u. For u = b, we have

$g(t \mid b) = g(t - 1 \mid b - 1)\,q + g(t - 1 \mid b)\,p$.
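The difference equation (1), together with its boundary form for u = b, already determines every g(t | u) by straightforward recursion on t. A minimal Python sketch (the function name and the chosen parameters are illustrative):

```python
def absorption_probabilities(p, b, t_max):
    """Return g with g[t][u] = probability of first reaching 0 at time t from u (0 <= u <= b)."""
    q = 1.0 - p
    g = [[0.0] * (b + 1) for _ in range(t_max + 1)]
    g[0][0] = 1.0                                    # g(0 | 0) = 1
    for t in range(1, t_max + 1):
        for u in range(1, b):                        # interior positions, equation (1)
            g[t][u] = q * g[t - 1][u - 1] + p * g[t - 1][u + 1]
        g[t][b] = q * g[t - 1][b - 1] + p * g[t - 1][b]   # reflecting barrier at b
    return g

g = absorption_probabilities(0.4, 3, 2000)
total_from_2 = sum(row[2] for row in g)              # should be close to 1
print(g[1][1], g[2][2], total_from_2)
```

The sum over t of g(t | u) approaches 1, in line with the later remark that eventual absorption is certain.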


Let P(u) be the 1 × b row vector (0 ··· 0 q 0 p 0 ··· 0) with q being the (u − 1)th component, and let G(t − 1) be the b × 1 column vector of elements g(t − 1 | i), (i = 1, 2, ..., b). Then equation (1) may be written in the matrix form as

(2)  $g(t \mid u) = P(u)\,G(t - 1)$.

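A further application of the difference equation (1) immediately leads to

(3)  $g(t \mid u) = P(u)\,Q\,G(t - 2)$,

where Q is the b × b matrix defined by

$Q = \begin{pmatrix} 0 & p & 0 & \cdots & 0 & 0 \\ q & 0 & p & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & 0 & \cdots & q & 0 & p \\ 0 & 0 & \cdots & 0 & q & p \end{pmatrix}$

and where, as before, G(t − 2) is the column vector of elements g(t − 2 | i), (i = 1, 2, ..., b). By successive applications of (1), it follows that

(4)  $g(t \mid u) = P(u)\,Q^{t-2}\,G(1)$.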


Let $\varphi(\theta \mid u) = \sum_{t=0}^{\infty} \theta^t g(t \mid u)$ be the generating function for the probabilities of absorption. We have from (4) that

(5)  $\varphi(\theta \mid u) = \theta^2 \sum_{t=0}^{\infty} P(u) (\theta Q)^t G(1) = \theta^2 P(u) (I - \theta Q)^{-1} G(1)$

provided θ lies in such a range that max [|θp|, |θq|] ≤ 1.
We note that G(1), being the column vector of elements g(1 | i), has the first component q, all other elements being zero. It follows that the right hand side of (5) may be written as the ratio of two determinants, namely

(6)  $\varphi(\theta \mid u) = \frac{\theta^2 q\,|D|}{|I - \theta Q|}$.


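The determinant |D| in the numerator is the same as |I − θQ| except that the first row is replaced by P(u). We first evaluate the determinant |I − θQ| in the denominator. Consider an n × n determinant $A_n$ of the form similar to |I − θQ| except that the (n, n)th element is 1. Then for this determinant the following recurrence relation holds:

(7)  $A_n = A_{n-1} - \theta^2 pq\,A_{n-2}$,  (n = 2, 3, ...)

where $A_1 = 1$ and $A_0$ is defined to be 1 for convenience. Writing

$S = \begin{pmatrix} 1 & -\theta^2 pq \\ 1 & 0 \end{pmatrix}$,

it follows from (7) immediately that

(8)  $\begin{pmatrix} A_n \\ A_{n-1} \end{pmatrix} = S^{n-1} \begin{pmatrix} A_1 \\ A_0 \end{pmatrix}$.

The two characteristic roots $\lambda_1$, $\lambda_2$ of the matrix S are found to have distinct values

$\lambda_1 = \tfrac{1}{2}[1 + (1 - 4\theta^2 pq)^{1/2}]$,  $\lambda_2 = \tfrac{1}{2}[1 - (1 - 4\theta^2 pq)^{1/2}]$.

Writing S in the spectral form

$S = B \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} B^{-1}$

with $B = \begin{pmatrix} \lambda_1 & \lambda_2 \\ 1 & 1 \end{pmatrix}$ and $B^{-1} = (\lambda_1 - \lambda_2)^{-1} \begin{pmatrix} 1 & -\lambda_2 \\ -1 & \lambda_1 \end{pmatrix}$, it follows from (8) that

(9)  $\begin{pmatrix} A_n \\ A_{n-1} \end{pmatrix} = B \begin{pmatrix} \lambda_1^{n-1} & 0 \\ 0 & \lambda_2^{n-1} \end{pmatrix} B^{-1} \begin{pmatrix} 1 \\ 1 \end{pmatrix}$

so that $A_n = (\lambda_1 - \lambda_2)^{-1} [\lambda_1^{n+1} - \lambda_2^{n+1}]$. Hence we have finally

(10)  $|I - \theta Q| = A_b - \theta p A_{b-1} = (\lambda_1 - \lambda_2)^{-1} [\lambda_1^{b+1} - \lambda_2^{b+1} - \theta p (\lambda_1^{b} - \lambda_2^{b})]$.

To evaluate the determinant |D| in the numerator, we first add to its first row the uth row multiplied by $\theta^{-1}$, thus reducing it to (0 ··· $\theta^{-1}$ 0 ··· 0) where $\theta^{-1}$ is in the uth position. Expanding the determinant by the first row, we obtain

$|D| = \theta^{-1} (-1)^{u+1} (-\theta q)^{u-1} [A_{b-u} - \theta p A_{b-u-1}] = \theta^{u-2} q^{u-1} (\lambda_1 - \lambda_2)^{-1} [\lambda_1^{b-u+1} - \lambda_2^{b-u+1} - \theta p (\lambda_1^{b-u} - \lambda_2^{b-u})]$.

Hence from equations (9) and (10), we have

(11)  $\varphi(\theta \mid u) = \frac{\theta^2 q\,|D|}{|I - \theta Q|} = \frac{\theta^u q^u [\lambda_1^{b-u+1} - \lambda_2^{b-u+1} - \theta p (\lambda_1^{b-u} - \lambda_2^{b-u})]}{[\lambda_1^{b+1} - \lambda_2^{b+1} - \theta p (\lambda_1^{b} - \lambda_2^{b})]}$.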
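The closed form (11) can be compared with a direct summation of $\theta^t g(t \mid u)$, where g is generated from the difference equation (1). The following Python sketch is only a numerical cross-check; it assumes $4\theta^2 pq < 1$ so that the two roots are real and distinct, and all names are illustrative.

```python
from math import sqrt

def phi_closed(theta, u, p, b):
    """Right-hand side of (11); requires 4 * theta**2 * p * q < 1."""
    q = 1.0 - p
    root = sqrt(1.0 - 4.0 * theta * theta * p * q)
    lam1, lam2 = (1.0 + root) / 2.0, (1.0 - root) / 2.0
    def bracket(n):
        return lam1**(n + 1) - lam2**(n + 1) - theta * p * (lam1**n - lam2**n)
    return (theta * q)**u * bracket(b - u) / bracket(b)

def phi_series(theta, u, p, b, t_max=600):
    """Sum of theta**t * g(t | u), with g built from the difference equation (1)."""
    q = 1.0 - p
    prev = [1.0] + [0.0] * b               # g(0 | .): only g(0 | 0) = 1
    total = 0.0
    for t in range(1, t_max + 1):
        cur = [0.0] * (b + 1)
        for x in range(1, b):
            cur[x] = q * prev[x - 1] + p * prev[x + 1]
        cur[b] = q * prev[b - 1] + p * prev[b]
        total += theta**t * cur[u]
        prev = cur
    return total

for start in range(1, 5):
    print(start, phi_closed(0.9, start, 0.4, 4), phi_series(0.9, start, 0.4, 4))
```

Setting θ = 1 in `phi_closed` should return 1 for every starting point, which corresponds to certain absorption.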
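From equation (11) we may draw the following conclusions:

(i)  $[\varphi(\theta \mid u)]_{\theta=1} = \frac{q^u [p^{b-u+1} - q^{b-u+1} - p (p^{b-u} - q^{b-u})]}{p^{b+1} - q^{b+1} - p (p^{b} - q^{b})} = 1$,

since at θ = 1 the roots $\lambda_1$, $\lambda_2$ are p and q. This is in agreement with the fact that eventual absorption is certain.

(ii) Rewriting

$\varphi(\theta \mid u) = \frac{\theta^u q^u [\lambda_1^{-u+1} - \theta p \lambda_1^{-u} - (\lambda_2/\lambda_1)^{b} (\lambda_2^{-u+1} - \theta p \lambda_2^{-u})]}{[(\lambda_1 - \theta p) - (\lambda_2/\lambda_1)^{b} (\lambda_2 - \theta p)]}$,

since $\lambda_1 > \lambda_2$,

(12)  $\lim_{b \to \infty} \varphi(\theta \mid u) = \theta^u q^u \lambda_1^{-u} = (\lambda_2 / \theta p)^u$.

The above expression is the generating function for the probabilities of absorption when no reflecting barrier is present, and is identical to the result obtained by Feller [1].

(iii) The expected duration of time before absorption takes place may be obtained from

$E(T) = [\partial \varphi(\theta \mid u) / \partial \theta]_{\theta=1}$

and is found to be

(13)  $E(T) = \frac{u}{q - p} + \frac{p^{b+1}}{q^{b} (q - p)^2} [1 - (q/p)^u]$  if p ≠ q.

When p = q = ½, $\lim_{\theta \to 1} [\partial \varphi(\theta \mid u) / \partial \theta]$ is evaluated using L’Hospital’s rule, and in this case

(14)  $E(T) = u + u(2b - u)$.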
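The expected duration can be checked by computing $\sum_t t\,g(t \mid u)$ directly from the difference equation (1). In the Python sketch below the closed form for p ≠ q is taken as $E(T) = u/(q-p) + p^{b+1}[1 - (q/p)^u] / (q^b (q-p)^2)$; this reading of (13) is an assumption about a poorly legible formula, chosen because it satisfies the first-step equations and reduces to (14) as p → q. Function names are illustrative.

```python
def expected_duration_numeric(u, p, b, t_max=4000):
    """Approximate E(T) = sum over t of t * g(t | u), using the difference equation (1)."""
    q = 1.0 - p
    prev = [1.0] + [0.0] * b               # g(0 | .)
    mean = 0.0
    for t in range(1, t_max + 1):
        cur = [0.0] * (b + 1)
        for x in range(1, b):
            cur[x] = q * prev[x - 1] + p * prev[x + 1]
        cur[b] = q * prev[b - 1] + p * prev[b]
        mean += t * cur[u]
        prev = cur
    return mean

def expected_duration_closed(u, p, b):
    """Closed forms (13) and (14); the p != q branch is the assumed reading of (13)."""
    q = 1.0 - p
    if abs(p - q) < 1e-12:
        return u + u * (2 * b - u)
    return u / (q - p) + p**(b + 1) * (1.0 - (q / p)**u) / (q**b * (q - p)**2)

print(expected_duration_numeric(3, 0.5, 5), expected_duration_closed(3, 0.5, 5))
print(expected_duration_numeric(3, 0.3, 5), expected_duration_closed(3, 0.3, 5))
```

For p well above q the absorption time has a very long tail, and `t_max` would have to be increased considerably before the numerical sum settles.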


3. Explicit expression for the probabilities of absorption. The form of equation (6) indicates that $\varphi(\theta \mid u)$ is simply a ratio of two polynomials in θ. Denote this by

(15)  $\varphi(\theta \mid u) = \frac{U(\theta)}{V(\theta)}$.

Both the numerator and the denominator have degree b. If the roots of V(θ), $\theta_1, \theta_2, \ldots, \theta_b$, are distinct, equation (15) may be expanded into partial fractions

(16)  $\varphi(\theta \mid u) = \sum_{\nu=1}^{b} \frac{\rho_\nu}{\theta_\nu - \theta}$

where $\rho_\nu$ are constants that can be determined by

(17)  $\rho_\nu = \frac{-U(\theta_\nu)}{[\partial V(\theta) / \partial \theta]_{\theta = \theta_\nu}}$.

We first find the roots of the denominator, making use of the variable α defined by

$(\cos \alpha)^{-1} = 2 (pq)^{1/2} \theta$.

Then $\lambda_{1,2} = (2 \cos \alpha)^{-1} [\cos \alpha \pm i \sin \alpha] = (2 \cos \alpha)^{-1} e^{\pm i\alpha}$, and in terms of the new variable, $\varphi(\theta \mid u)$ may be written

(18)  $\varphi(\theta \mid u) = (q/p)^{u/2}\, \frac{q^{1/2} \sin (b - u + 1)\alpha - p^{1/2} \sin (b - u)\alpha}{q^{1/2} \sin (b + 1)\alpha - p^{1/2} \sin b\alpha}$.

The denominator of (18) is found to have b distinct roots $\alpha_\nu$, which lie in the subintervals

$\left( \frac{\nu\pi}{b + 1}, \frac{(\nu + 1)\pi}{b + 1} \right)$.

The roots of V(θ) are then

(19)  $\theta_\nu = (2 (pq)^{1/2} \cos \alpha_\nu)^{-1}$.


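From equation (17), we obtain

(20)  $\rho_\nu = \frac{-(q/p)^{u/2} [q^{1/2} \sin (b - u + 1)\alpha_\nu - p^{1/2} \sin (b - u)\alpha_\nu]}{[(b + 1) q^{1/2} \cos (b + 1)\alpha_\nu - b p^{1/2} \cos b\alpha_\nu]\,(d\alpha / d\theta)_{\theta = \theta_\nu}} = \frac{-(q/p)^{u/2} [q^{1/2} \sin (b - u + 1)\alpha_\nu - p^{1/2} \sin (b - u)\alpha_\nu] \sin \alpha_\nu}{2 (pq)^{1/2} [(b + 1) q^{1/2} \cos (b + 1)\alpha_\nu - b p^{1/2} \cos b\alpha_\nu] \cos^2 \alpha_\nu}$.

It remains now to expand each term in equation (16) into a geometric series. The coefficient, g(t | u), of $\theta^t$ is found to be

$g(t \mid u) = \sum_{\nu=1}^{b} \frac{\rho_\nu}{\theta_\nu^{t+1}}$;

this together with equations (19) and (20) yields finally

$g(t \mid u) = -2^{t} p^{(t-u)/2} q^{(t+u)/2} \sum_{\nu=1}^{b} \cos^{t-1} \alpha_\nu\, \frac{[q^{1/2} \sin (b - u + 1)\alpha_\nu - p^{1/2} \sin (b - u)\alpha_\nu] \sin \alpha_\nu}{[(b + 1) q^{1/2} \cos (b + 1)\alpha_\nu - b p^{1/2} \cos b\alpha_\nu]}$.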
ON THE QUEUEING PROCESS M/G/1


By C. R. HEATHCOTE


Australian National University


Introduction. The temporal development of the queueing process M/G/1 
(Markov or Poisson input, general service time distribution, one server) has been 
studied by several authors using various methods. Takacs [1] and others have 
centered their attention on distributions of waiting time; Gaver [2] utilized 
Kendall’s method of the imbedded Markov Chain. More recently, Keilson and 
Kooharian [3] used a method first developed systematically by Cox [4]. This 
last approach consists of restating the original non-Markovian process as a 
Markov one by the inclusion of supplementary variables in the definition of 
states of the system. For the process M/G/1 only one supplementary variable is 
required, namely the elapsed service time of the customer currently in service. 

The purpose of this note is to point out that both waiting time distributions 
and the distribution of queue length for the temporal process can be obtained 
in a simple way by the method of supplementary variables. We extend the re- 
sults of Keilson and Kooharian [3] to find the temporal analogue of the classical 
Pollaczek-Khinchine formula [5], and from this obtain the distribution of wait- 
ing time of a customer arriving at time ¢. To avoid repetition, reference is made 
to [3] for a complete description of the problem, and we use the same notation, 
as follows. 

Let the probability density of interarrival times be $\lambda e^{-\lambda t}$;
$W_m(x, t)$, $m = 0, 1, 2, \ldots$, be the joint probability of $m$ customers
waiting (servee is excluded) at time $t$ and the elapsed service time of the
current customer $x$; $E(t)$ the null probability; and $D(x)$ the probability
density of the service time. If $\eta(x)\,\delta x$ is the first order
probability that a service completion occurs in the interval
$(x, x + \delta x)$, conditional on a customer having reached the "age" $x$, then

$$D(x) = \eta(x) \exp\left\{-\int_0^x \eta(u)\,du\right\}.$$


Define the generating function 


$$H(s, x, t) = \exp\left\{\int_0^x \eta(u)\,du\right\} \sum_{m=0}^{\infty} s^m W_m(x, t)$$


and assume that initially the system is empty, 
(1) $E(0) = 1, \qquad W_m(x, 0) = 0.$
Laplace transforms are denoted by lower case letters, thus
$$e(p) = \mathcal{L}\{E(t)\} = \int_0^\infty e^{-pt} E(t)\,dt, \qquad \operatorname{Re} p > 0.$$
Pollaczek-Khinchine Formula. It is shown in [3], equation (3.15), that
$$H(s, x, t) = \exp\{-\lambda(1 - s)x\}\,H_0(s, t - x)$$
and that for initial conditions (1) the following holds ([3], equation (3.21))

(2) $$h_0(s, p) = \frac{(p + \lambda - \lambda s)\,e(p) - 1}{d(p + \lambda - \lambda s) - s}.$$


From these results we seek the generating function of the distribution of 
queue length, 
$$\psi(s, t) = E(t) + \sum_{m=1}^{\infty} s^m P_m(t),$$


where $P_m(t) = \int_0^t W_m(x, t)\,dx$ is the probability of $m$ customers irrespective
of the elapsed service time $x$. Note that $x \le t$ because of the initial conditions
chosen. The generating function 


$$K(s, x, t) = E(t) + \sum_{m=0}^{\infty} s^{m+1} W_m(x, t)$$
is, from (2),
$$K(s, x, t) = E(t) + s H_0(s, t - x) \exp\left\{-\lambda(1 - s)x - \int_0^x \eta(u)\,du\right\}.$$


Using the relation 


$$\exp\left\{-\int_0^x \eta(u)\,du\right\} = 1 - \int_0^x D(u)\,du$$


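we find the Laplace transform of $\psi(s, t)$ as follows:

(3) $$\psi^*(s, p) = \int_0^\infty e^{-pt}\psi(s, t)\,dt = \int_0^\infty dt\, e^{-pt} \int_0^t K(s, x, t)\,dx
= \frac{(p + \lambda - \lambda s)(1 - s)\,e(p) + s\{1 - d^{-1}(p + \lambda - \lambda s)\}}{(p + \lambda - \lambda s)\{1 - s\,d^{-1}(p + \lambda - \lambda s)\}}.$$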
The generating function of the equilibrium distribution, which exists for

$$\lambda \int_0^\infty x D(x)\,dx < 1,$$


is obtained from (3) by standard Tauberian arguments; 


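(4) $$\lim_{p \to 0} p\,\psi^*(s, p) = \lim_{t \to \infty} \psi(s, t) = \frac{(1 - s)E}{1 - s\,d^{-1}(\lambda - \lambda s)}.$$

$E$ is the equilibrium null probability

$$E = \lim_{t \to \infty} E(t) = 1 - \lambda \int_0^\infty x D(x)\,dx.$$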
Equation (4) is the Pollaczek-Khinchine formula for the distribution of queue
length, and (3) is its temporal analogue for initial conditions (1). 
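
As a numerical sanity check on equation (4), the sketch below evaluates the
equilibrium generating function for an exponential service density, for which
the queue length distribution is known to be geometric. The rates `lam` and
`mu`, and all function names, are illustrative assumptions and are not taken
from the paper; $d^{-1}(\cdot)$ is read as the reciprocal $1/d(\cdot)$.

```python
def pk_queue_length_pgf(s, lam, service_lt, mean_service):
    """Equation (4): (1 - s) E / (1 - s / d(lam - lam*s)).

    service_lt is the Laplace transform d(.) of the service density and
    E = 1 - lam * mean_service is the equilibrium null probability.
    """
    E = 1.0 - lam * mean_service
    return (1.0 - s) * E / (1.0 - s / service_lt(lam - lam * s))


# Assumed example: exponential service with rate mu (an M/M/1 queue).
lam, mu = 1.0, 2.0
rho = lam / mu


def exp_service_lt(a):
    # Laplace transform of the density mu * exp(-mu * x)
    return mu / (mu + a)


def mm1_pgf(s):
    # Known geometric queue length law for M/M/1: (1 - rho) / (1 - rho * s)
    return (1.0 - rho) / (1.0 - rho * s)


for s in (0.0, 0.3, 0.9):
    print(s, pk_queue_length_pgf(s, lam, exp_service_lt, 1.0 / mu), mm1_pgf(s))
```

The point $s = 1$ is avoided because (4) takes the form 0/0 there; the limit is 1.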

The only unknown occurring in (3) is e(p), the Laplace transform of the null 
probability. An explicit expression for e(p), at least in principle, can be obtained 
by the following standard argument. Since $\psi^*(s, p)$ is a generating function it
converges for at least $|s| < 1$, so that within the unit circle zeros in $s$ of
numerator and denominator coincide. By Rouché's Theorem the only zero of the
denominator within the unit circle is the smallest zero of the equation

$$d(p + \lambda - \lambda s) = s.$$
If this zero is $s = \xi(p)$, then
(5) $$e(p) = [p + \lambda - \lambda \xi(p)]^{-1}.$$


The Lagrange Inversion formulae ([6], page 132) can now be applied to obtain
$e(p)$ and hence the null probability $\mathcal{L}^{-1}[e(p)]$. This procedure is often too
complicated to carry out, but in many cases of practical importance, for example
with $\chi^2$ type service distributions, an explicit expression for $E(t)$ can be found.
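
When Lagrange inversion is impractical, the zero $\xi(p)$ and hence $e(p)$ in
(5) can at least be evaluated numerically for real $p > 0$. The following is a
minimal sketch, not the paper's method: it assumes that iterating
$s \mapsto d(p + \lambda - \lambda s)$ from $s = 0$ increases monotonically to
the smallest zero, which holds when $d$ is the Laplace transform of a service
density. Function names and the rates used in the check are hypothetical.

```python
import math


def smallest_zero(p, lam, service_lt, tol=1e-13, max_iter=100000):
    """Smallest root s in [0, 1) of d(p + lam - lam*s) = s, for real p > 0."""
    s = 0.0
    for _ in range(max_iter):
        s_next = service_lt(p + lam - lam * s)
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    raise RuntimeError("fixed-point iteration did not converge")


def null_probability_transform(p, lam, service_lt):
    """Equation (5): e(p) = 1 / (p + lam - lam * xi(p))."""
    xi = smallest_zero(p, lam, service_lt)
    return 1.0 / (p + lam - lam * xi)


# Assumed check case: exponential service with rate mu, where the zero is
# the smaller root of lam*s**2 - (lam + mu + p)*s + mu = 0.
lam, mu, p = 1.0, 2.0, 0.5
xi_numeric = smallest_zero(p, lam, lambda a: mu / (mu + a))
b = lam + mu + p
xi_closed = (b - math.sqrt(b * b - 4.0 * lam * mu)) / (2.0 * lam)
print(xi_numeric, xi_closed)
```

Recovering $E(t)$ itself would still require a numerical inverse Laplace
transform, which is outside this sketch.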


Distribution of Waiting Times. Let $\eta(t)$ be the waiting time of a customer
arriving at the instant $t$. If the system is empty at $t$ then $\eta(t) = 0$. Otherwise
$\eta(t)$ is the sum of the service times of those customers already waiting and the
remaining service time of the current customer. Cox [4] previously used this
argument to obtain the waiting time distribution in the equilibrium state. If


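$$F(x, t) = \Pr\{\eta(t) \le x\}$$
and
$$\phi(\alpha, t) = \int_0^\infty e^{-\alpha x}\,d_x F(x, t),$$
then, since the service times are distributed independently,

(6) $$\phi(\alpha, t) = E(t) + \int_0^t du \sum_{m=0}^{\infty} W_m(u, t)[d(\alpha)]^m \, \frac{\int_0^\infty e^{-\alpha v} D(u + v)\,dv}{1 - \int_0^u D(y)\,dy}$$
$$= E(t) + \int_0^t du\, H_0(d(\alpha), t - u)\, e^{-\lambda(1 - d(\alpha))u} \int_0^\infty e^{-\alpha v} D(u + v)\,dv.$$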


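Let $\Phi(p, \alpha) = \int_0^\infty e^{-pt} \phi(\alpha, t)\,dt$. Using elementary properties of the Laplace
transform we find that

$$\Phi(p, \alpha) = \{\alpha e(p) - 1\}\{\alpha - p - \lambda(1 - d(\alpha))\}^{-1}$$

and hence that

$$\phi(\alpha, t) = e^{\alpha t - \lambda(1 - d(\alpha))t} \left[1 - \alpha \int_0^t e^{-\alpha\tau + \lambda(1 - d(\alpha))\tau} E(\tau)\,d\tau\right].$$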
This result has been obtained previously by Takács ([1], Theorem 2) using a
different method. 
THE SEQUENTIAL DESIGN OF EXPERIMENTS FOR
INFINITELY MANY STATES OF NATURE


By Arthur E. Albert
Columbia University


0. Summary. In [2] and [3], H. Chernoff discussed the Sequential Design of 
Experiments. In [2], a procedure was exhibited and was proved to be asymptoti- 
cally optimal for the hypothesis testing problem when there are finitely many 
states of nature. This paper extends Chernoff’s results to infinitely many states 
of nature. 


1. Introduction. As a rule, when a scientist performs an experiment in order 
to obtain information about a certain phenomenon, the outcome of the experi- 
ment not only serves to cast light on the problem at hand, but also aids the 
experimenter in designing a more informative experiment. As more and more 
data is accumulated, his experiments can be made more and more informative 
until he reaches a point where he feels that further experiments are unnecessary. 
He then announces his results. 

In [2] and [3], Chernoff dealt with this procedure (which he called the
"Sequential Design of Experiments") and in [2], he proposed a sequential procedure
which applies to the two action (i.e., hypothesis testing) problem when there
are finitely many states of nature. It was shown that the risk under this
procedure is approximately $-c \log c / I(\theta)$ when the cost $c$ per experiment is very
small (where $I(\theta)$ is an appropriately defined information number). It was
also shown that in order for another procedure to do appreciably better for
some value of the parameter (state of nature) $\theta$, it must do worse by an order
of magnitude for some other value of the parameter (as $c$ tends to zero).

Chernoff’s procedure can be partially described by saying that at each stage, 
the experimenter continues experimenting so long as the likelihood ratio is less 
than 1/c. If another experiment is to be performed, the experimenter chooses the 
experiment as though he believed that the current value of the maximum likeli- 
hood estimate (m.l.e.) were the true value of the parameter. If the likelihood 
ratio is so large that no more observations are required, the experimenter accepts
the hypothesis corresponding to the current value of the m.l.e. (It has
been pointed out by one of the referees, that the idea of estimating the true 
situation by the m.l.e. and then using this estimate to decide what future course 
of action to take, seems to date back to A. Wald’s work on sequential estimation 
in [9].) 


Received June 9, 1960; revised January 11, 1961. 
We shall deal with the extension of Chernoff's procedure and results to the
case where the possible states of nature are infinite in number. A class of
procedures will be exhibited which possesses the property that for any positive
number $\epsilon$, there is a member of this class for which the risk is no larger than
$-(1 + \epsilon + o(1))\, c \log c / I(\theta)$ as $c$ tends to zero.

The following example will serve as a prototype for the sequential design of 
experiments problem as applied to hypothesis testing: 

Two random variables are independent and normally distributed with means
$m_1$ and $m_2$ respectively, and unit variance. It is desired to test $H_1: m_1 \ge m_2$ vs.
$H_2: m_1 < m_2$. The cost of making the wrong decision (hereafter called the
"regret") is a function of the distance from the true parameter $\theta = (m_1, m_2)$ to
the boundary line $\{\theta': \theta' = (m_1', m_2'),\ m_1' = m_2'\}$. Two experiments are available.
These are $e_1$: Observe the first random variable, and $e_2$: Observe the second
random variable. After each experiment, the statistician must decide whether
to perform another (independent) experiment or to stop. If he continues, he 
must decide which experiment to perform next. If he stops, he must decide 
whether to accept H, or H2. 


2. The Relevance of Kullback-Leibler (K.L.) Information Numbers. In [2] 
and [3], extensive heuristic arguments were set forth to motivate the use of K.L. 
information numbers in the sequential design problem. (See [7] for a wider realm 
of application.) Chernoff’s arguments can be briefly summarized as follows: 

Suppose an experiment is repeated many times, yielding independent
observations $Y_1, Y_2, \ldots, Y_n, \ldots$. Let $H_1$ be the hypothesis that the observations
have a density $f_1(y)$ and let $H_2$ be the hypothesis that the observations have a
density $f_2(y)$. The Bayes strategies are the Wald sequential likelihood ratio tests.

A sequential likelihood-ratio test is characterized by two numbers $A$ and $B$
$(A > B)$: After the $n$th observation, continue sampling if

$$B < \sum_{j=1}^{n} \log [f_1(Y_j)/f_2(Y_j)] < A.$$

Stop sampling and accept $H_1$ if

$$\sum_{j=1}^{n} \log [f_1(Y_j)/f_2(Y_j)] \ge A.$$

Stop sampling and accept $H_2$ if

$$\sum_{j=1}^{n} \log [f_1(Y_j)/f_2(Y_j)] \le B.$$


The appropriate numbers A and B are determined by the a priori probabilities 
and the costs. However, when c is very small, compared to the regret, it turns 
out that A is approximately equal to —log c and B is approximately equal to 
log c. 
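
The rule just described can be sketched in a few lines. This is an
illustrative implementation only: the function names, the unit-variance normal
densities with means 1 and 0, and the value `c = 0.05` are assumptions chosen
for the example, with the boundaries set to the small-$c$ approximations
$A = -\log c$ and $B = \log c$.

```python
import math


def sprt(observations, log_f1, log_f2, A, B):
    """Sequential likelihood-ratio test with boundaries A > B.

    Returns (decision, n), where n is the number of observations used.
    """
    if not A > B:
        raise ValueError("the boundaries must satisfy A > B")
    total = 0.0
    n = 0
    for y in observations:
        n += 1
        total += log_f1(y) - log_f2(y)   # running sum of log f1/f2
        if total >= A:
            return "accept H1", n
        if total <= B:
            return "accept H2", n
    return "continue", n                  # data exhausted before a boundary was hit


def normal_log_density(mean):
    # Unit variance; the additive constant is dropped because it cancels
    # in the likelihood ratio.
    return lambda y: -0.5 * (y - mean) ** 2


c = 0.05                                  # assumed cost per observation
A, B = -math.log(c), math.log(c)
log_f1 = normal_log_density(1.0)          # assumed density under H1
log_f2 = normal_log_density(0.0)          # assumed density under H2

print(sprt([1.0] * 10, log_f1, log_f2, A, B))
print(sprt([0.0] * 10, log_f1, log_f2, A, B))
```

With these densities each observation $y$ contributes $y - 1/2$ to the running
sum, so a run of observations equal to 1 drifts up by 0.5 per step.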

Denote the probability of error (when $H_i$ is true) by $\alpha_i$ $(i = 1, 2)$ and the
expected sample size (when $H_i$ is true) by $N_i$ $(i = 1, 2)$. In [10], Wald showed
that for small $c$, $N_1 \approx -B/I_1$, $N_2 \approx A/I_2$, $\alpha_1 \approx e^{-A}$ and $\alpha_2 \approx e^{B}$, where

$$I_1 = \int \log [f_1(y)/f_2(y)]\, f_1(y)\,dy,$$

$$I_2 = \int \log [f_2(y)/f_1(y)]\, f_2(y)\,dy.$$


(The quantities $I_i$ $(i = 1, 2)$ have subsequently come to be known as
Kullback-Leibler information numbers.)

If the regret for making an incorrect decision (when $H_i$ is true) is $r_i$ $(i = 1, 2)$
then the average regret (or risk) under $H_i$ can be approximated by

$$R_i = cN_i + r_i\alpha_i \approx (-c \log c)/I_i$$

when $c$ is small compared to $r_i$ $(i = 1, 2)$. (See [2].)

Suppose that a design element is introduced: Assume that two equally costly
experiments $e_1$ and $e_2$ are available for testing $H_1$ against $H_2$. If the experimenter
chooses $e_j$, performs it exclusively, and proceeds in an optimal fashion, his risk
under $H_i$ will be approximately inversely proportional to $I_i(e_j)$ when $c$ is small.
Hence, if $I_1(e_1) > I_1(e_2)$ and $I_2(e_1) > I_2(e_2)$, it obviously behooves the
statistician to select $e_1$.

However, if $I_1(e_1) > I_1(e_2)$ and $I_2(e_1) < I_2(e_2)$, $e_1$ is better than $e_2$ if $H_1$ is
true, but $e_2$ is better than $e_1$ if $H_2$ is true.

If the cost per experiment is small compared to the cost of making an
incorrect decision, the experimenter may find it expedient to perform an additional
experiment, even though he is virtually convinced that $H_1$ (for instance) is the
true hypothesis. In this case ($I_1(e_1) > I_1(e_2)$) it would seem that he would be
wisest to choose $e_1$.

Owing to the uncertainty about the true state of nature, the statistician is 
bound to make mistakes at the early stages of experimentation, but if the prob- 
ability laws are such that the true hypothesis becomes more and more evident 
as data accumulates, the small cost of experimentation will make initial mistakes 
in choosing experiments relatively unimportant, and eventually the statistician 
will begin performing the most advantageous experiment and stick to it until he 
decides to make his terminal decision. 

If the hypotheses are composite and if a finite number of experiments are
available to the experimenter, considerations of the sort mentioned above
suggest that if the experimenter is almost positive that $\theta$ is the true state of nature
(say, $\theta \in H_1$), he should choose his next experiment so as to maximize
$\inf_{\varphi \in H_2} I(\theta, \varphi, e)$, where

$$I(\theta, \varphi, e) = \int \log [f(y, \theta, e)/f(y, \varphi, e)]\, f(y, \theta, e)\,dy,$$

and $f(y, \theta, e)$ is the density of the random variable observed under the
experiment $e$.
The appearance of an expression of the form $\max_e \min_\varphi I(\theta, \varphi, e)$ immediately
calls to mind a resemblance to similar-looking expressions which occur in the
theory of games. By interpreting $I(\theta, \cdot, \cdot)$ as a payoff matrix, we recall that it
is sometimes possible to do a better job of maximizing $\min_\varphi I(\theta, \varphi, e)$ with
respect to $e$ if we utilize randomized strategies.

A randomized experiment can easily be interpreted when the collection of
available experiments is finite (or countable). If the statistician consults a table
of random numbers, chooses experiment $e$ with probability $\lambda\{e\}$ ($\sum_e \lambda\{e\} = 1$),
and then performs experiment $e$, this process constitutes a randomized
experiment which can be denoted by $\lambda$. It will be shown that a Kullback-Leibler
information number for the randomized experiment $\lambda$ can be consistently defined
by

$$I(\theta, \varphi, \lambda) = \sum_e I(\theta, \varphi, e)\,\lambda\{e\}.$$


3. General formulation. We now extend the notions of the previous section 
to the case where the parameter space is not finite. 

Suppose a statistician is contemplating two courses of action in connection
with a problem of inference. The true state of nature is unknown to the
statistician, but corresponds to a point in an abstract space $\Theta$. Denoting the
two (terminal) actions by $a_1$ and $a_2$, we assume that $\Theta$ can be partitioned into
three sets:

$$\Theta = \Theta_0 \cup \Theta_1 \cup \Theta_2.$$

If the true state of nature is in $\Theta_0$, either action is acceptable, but if the true
s.o.n. lies in $\Theta_i$, then $a_i$ is preferred $(i = 1, 2)$. If $\theta \in \Theta_1 \cup \Theta_2$ is the true s.o.n.
and the non-preferred action is taken, the regret is given by $r(\theta) > 0$. We can
extend the domain of definition of $r$ to $\Theta$ by setting $r(\theta) = 0$ for $\theta \in \Theta_0$.

The statistician has at his disposal a finite set of (pure) experiments

$$\mathcal{E} = \{e_1, e_2, \ldots, e_M\}.$$

(From now on, $e$ with or without subscripts will denote a generic element of $\mathcal{E}$.)

By performing a sequence of experiments, the statistician hopes to amass
enough data to make an intelligent guess (or terminal decision) as to whether
the true s.o.n. $\theta$ lies in $\Theta_1$ or in $\Theta_2$, and then will take action $a_1$ or $a_2$ accordingly.
(He is not concerned if $\theta \in \Theta_0$, for then either action is acceptable to him.)

If experiment $e$ is performed, the random variable $Y_e$, which takes its values
in a measure space $(\mathcal{Y}_e, \mu_e)$, is observed. It is assumed that $Y_e$ has a density
with respect to (w.r.t.) $\mu_e$ for each $\theta \in \Theta$. Hereafter, we shall denote this density
$f(y, \theta, e)$.

If the $(n + 1)$st experiment $e^{(n+1)}$ is chosen according to any measurable rule
(i.e., $e^{(n+1)} = e^{(n+1)}(Y^{(1)}, Y^{(2)}, \ldots, Y^{(n)})$ is a measurable function of the previous $n$
observations $Y^{(1)}, Y^{(2)}, \ldots, Y^{(n)}$), the outcome of this experiment (once $e^{(n+1)}$
is specified) is assumed independent of the $n$ previous outcomes. No matter
which experiment is chosen, we assume a sampling cost of $c$ units per observation.
When a randomized experiment $\lambda$ is performed, two random variables (r.v.'s)
are, in effect, being observed. First, the statistician observes the value of the r.v.
$E$, whose probabilistic behavior is governed by the relation

$$P\{E = e\} = \lambda\{e\}.$$

After observing $E$, he observes $Y_E$. The probabilistic behavior of $Y_E$ (given
that $E = e$) is, of course, dependent upon the true state of nature $\theta$. But aside
from that, it is known that for every ($\mu_e$) measurable subset $B$ of $\mathcal{Y}_e$,

$$P_\theta\{Y_e \in B\} = \int_B f(y, \theta, e)\,d\mu_e(y).$$


It will be convenient for us to have a notation for dealing with r.v.'s which
are associated with a randomized experiment $\lambda$: When the (randomized) experiment
$\lambda$ is performed, the statistician is actually observing the values of the random
variable

$$X_\lambda = (E, Y_E)$$
which takes its values in the space
$$\mathcal{X} = \{x: x = (e, y),\ e \in \mathcal{E},\ y \in \mathcal{Y}_e\}.$$
If $S$ is a subset of $\mathcal{X}$, we define the projection of $S$ on $\mathcal{Y}_e$ by
$$S_e = \{y: (e, y) \in S\}.$$

We define a set function $\mu$ on all sets $S \subset \mathcal{X}$ having the property that $S_e$ is
($\mu_e$) measurable for all $e$, by

$$\mu(S) = \sum_e \mu_e(S_e).$$


It is easy to see that the domain of $\mu$ is a $\sigma$-algebra of sets and that $\mu$ is a
measure on $\mathcal{X}$. Since it has been assumed that the (pure) experiments are such
that $Y^{(n+1)}$ is independent of the past given $e^{(n+1)}$, it follows that the outcome,
$X^{(n+1)} = X_{\lambda^{(n+1)}}$, of the $(n + 1)$st randomized experiment is also conditionally
independent of the previous $n$ observations, once $\lambda^{(n+1)}$ is specified.

It is clear that $X_\lambda$ has a density (w.r.t. $\mu$) over $\mathcal{X}$:
$$f(x, \theta, \lambda) = f(y, \theta, e)\,\lambda\{e\} \quad \text{if } x = (e, y) \text{ and } y \in \mathcal{Y}_e.$$

We define the K.L. information number for the experiment $\lambda$ by

$$I(\theta, \varphi, \lambda) = \int \log \frac{f(x, \theta, \lambda)}{f(x, \varphi, \lambda)}\, f(x, \theta, \lambda)\,d\mu(x).$$

It is plain then, that

$$I(\theta, \varphi, \lambda) = \sum_e I(\theta, \varphi, e)\,\lambda\{e\}.$$
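
To see why randomization can help, the mixture formula can be evaluated on the
prototype example of Section 1. The sketch below is an illustration under
stated assumptions, not part of the paper: it uses the standard fact that the
K.L. number between unit-variance normals with means $m$ and $m'$ is
$(m - m')^2/2$, takes $\theta = (1, 0)$, and searches the alternative only
along the boundary segment $\{(t, t): m_2 \le t \le m_1\}$, where the infimum
of the (convex) mixture information is attained. All names are hypothetical.

```python
def kl_normal(mean, mean_alt):
    """K.L. number between N(mean, 1) and N(mean_alt, 1)."""
    return 0.5 * (mean - mean_alt) ** 2


def kl_randomized(theta, phi, lam):
    """I(theta, phi, lambda) = sum over e of lambda{e} * I(theta, phi, e).

    Experiment e_i observes only the i-th variable, so I(theta, phi, e_i)
    depends only on the i-th coordinates of theta and phi.
    """
    return sum(l * kl_normal(t, f) for l, t, f in zip(lam, theta, phi))


def worst_case_information(theta, lam, grid_size=2001):
    """Grid approximation to inf over the alternative of I(theta, phi, lambda)."""
    m1, m2 = theta
    lo, hi = min(m1, m2), max(m1, m2)
    values = []
    for k in range(grid_size):
        t = lo + (hi - lo) * k / (grid_size - 1)
        values.append(kl_randomized(theta, (t, t), lam))
    return min(values)


def best_randomization(theta, steps=100):
    """Grid search for the lambda maximizing the worst-case information."""
    candidates = [(k / steps, 1.0 - k / steps) for k in range(steps + 1)]
    return max(candidates, key=lambda lam: worst_case_information(theta, lam))


theta = (1.0, 0.0)
lam_star = best_randomization(theta)
print(lam_star, worst_case_information(theta, lam_star))
print(worst_case_information(theta, (1.0, 0.0)))   # pure experiment e_1
```

In this example either pure experiment has worst-case information zero, while
the equal mixture does strictly better, which is the game-theoretic point made
in Section 2.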


4. An "optimal" class of procedures. We now define the class of sequential
procedures with which we shall deal. As is the custom, we shall define our
procedure(s) by giving the stopping rule, the terminal decision rule and in addition,
we shall prescribe a rule for choosing the next (possibly randomized) experiment
if the data and stopping rule allow an additional sample.


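Given the data from the first $n$ (possibly randomized) experiments $X^{(1)}$,
$X^{(2)}, \ldots, X^{(n)}$, we define $L_n(\theta)$ to be the log of the likelihood:

(4.1) $$L_n(\theta) = \sum_{j=1}^{n} \log f(X^{(j)}, \theta, \lambda^{(j)}),$$

where

$$X^{(j)} = X_{\lambda^{(j)}}.$$
Let
(4.2) $$L_{in} = \sup_{\theta' \in \Theta_i} L_n(\theta') \qquad (i = 1, 2).$$
The generalized log of the likelihood ratio can be defined by
(4.3) $$L_n = \max\{L_{1n} - L_{2n},\ L_{2n} - L_{1n}\}.$$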


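If, for each $\theta \in \Theta_1 \cup \Theta_2$, we define $a(\theta)$ to be the hypothesis alternative to the
one containing $\theta$:

(4.4) $$a(\theta) = (\Theta_1 \cup \Theta_2) - \Theta_i \quad \text{if } \theta \in \Theta_i,$$

then it is easily verified that

(4.5) $$L_n = \sup_{\theta' \in \Theta_1 \cup \Theta_2} \left[ \inf_{\theta'' \in a(\theta')} \sum_{j=1}^{n} \log \frac{f(X^{(j)}, \theta', \lambda^{(j)})}{f(X^{(j)}, \theta'', \lambda^{(j)})} \right].$$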
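For any $p$ $(0 < p \le 1)$, we say that $\hat\theta_n$ is a $p$-pseudo maximum likelihood
estimate ($p$-p.m.l.e.) over $\Theta = \Theta_0 \cup \Theta_1 \cup \Theta_2$ if:

(1) $\hat\theta_n = \hat\theta_n(X^{(1)}, X^{(2)}, \ldots, X^{(n)})$ is a function of the first $n$ observations
$X^{(1)}, X^{(2)}, \ldots, X^{(n)}$;

(2) $$\prod_{j=1}^{n} f(X^{(j)}, \hat\theta_n, \lambda^{(j)}) \ge p \sup_{\theta \in \Theta} \prod_{j=1}^{n} f(X^{(j)}, \theta, \lambda^{(j)}).$$

For any $p$ $(0 < p < 1)$, $\hat\theta_n$ will always exist, and we tacitly assume that a
measurable version of $\hat\theta_n$ is available. If $\hat\theta_n$ exists for $p = 1$, it corresponds to the
usual m.l.e. Throughout the remainder of this paper, we assume $p$ to be fixed
and less than unity.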

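We define $\Lambda$ to be the set of probability distributions over $\mathcal{E}$ and $\Lambda_\gamma$ to be the
set of probability distributions over $\mathcal{E}$ which assign at least probability
$\gamma$ $(0 \le \gamma \le 1/M)$ to each element of $\mathcal{E}$.

Given $\theta \in \Theta_1 \cup \Theta_2$ and $\gamma$ $(0 < \gamma \le 1/M)$, define $\lambda_\theta^\gamma$ to be that element of $\Lambda_\gamma$
which maximizes $\inf_{\varphi \in a(\theta)} [I(\theta, \varphi, \lambda)]$. (Such an element exists by Theorem 2.4.2
of [1], since $\Lambda_\gamma$ is convex and has a finite set of extreme points.)

Given $\gamma_1$ $(0 \le \gamma_1 \le 1/M)$, $\gamma_2$ $(\gamma_2 \ge 0)$ and $c$ $(0 < c < 1)$, we define procedure
$A(\gamma_1, \gamma_2)$ as follows: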
(1) On the first trial, perform any randomized experiment from $\Lambda_{\gamma_1}$.
(2) After the $n$th trial, if $L_n \ge -(1 + \gamma_2) \log c$, stop sampling and accept
the hypothesis corresponding to

$\Theta_1$ if $L_n = L_{1n} - L_{2n}$,
$\Theta_2$ if $L_n = L_{2n} - L_{1n}$.

(3) If, after the $n$th observation, $L_n < -(1 + \gamma_2) \log c$, compute the
$p$-p.m.l.e. $\hat\theta_n$, and perform the $(n + 1)$st randomized experiment with


rire) — (rz? if 6, €0,U 
if 6, € Oo. 


Each member of the class depends upon four parameters: $\gamma_1$ ($0 \le \gamma_1 \le 1/M$,
where $M$ is the number of pure experiments available to the statistician),
$\gamma_2$ $(\gamma_2 \ge 0)$, $p$ $(0 < p \le 1)$, and $c$ $(0 < c < 1)$.

Throughout the course of our discussion, $p$ will remain fixed and we shall
investigate the risk associated with the procedure as $c$ approaches zero. For these
reasons, we suppress $c$ and $p$ when we talk about a typical procedure "$A(\gamma_1, \gamma_2)$"
from this class.

It should be pointed out that the procedure proposed by Chernoff in [2] and
[3] for the case when $\Theta$ is finite, corresponds to $A(0, 0)$ with $p = 1$.
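
The stopping and terminal decision rule in step (2) reduces to a comparison of
the generalized log likelihood ratio (4.3) with the threshold
$-(1 + \gamma_2) \log c$. The sketch below covers only that step; it assumes the
two suprema in (4.2) have already been computed by some other means, and the
function name and numerical inputs are hypothetical.

```python
import math


def terminal_decision(L1n, L2n, c, gamma2):
    """Step (2) of procedure A(gamma1, gamma2).

    L1n and L2n are the suprema of the log likelihood over Theta_1 and
    Theta_2 (equation (4.2)). Returns 1 or 2 for the accepted hypothesis,
    or None if sampling should continue.
    """
    if not 0.0 < c < 1.0:
        raise ValueError("the cost c must lie in (0, 1)")
    if gamma2 < 0.0:
        raise ValueError("gamma2 must be nonnegative")
    threshold = -(1.0 + gamma2) * math.log(c)
    Ln = max(L1n - L2n, L2n - L1n)        # equation (4.3)
    if Ln < threshold:
        return None                        # go on to step (3)
    return 1 if L1n >= L2n else 2


# With c = 0.01 and gamma2 = 0.1 the threshold is about 5.07.
print(terminal_decision(-10.0, -16.0, c=0.01, gamma2=0.1))
print(terminal_decision(-10.0, -12.0, c=0.01, gamma2=0.1))
print(terminal_decision(-20.0, -10.0, c=0.01, gamma2=0.1))
```

Increasing `gamma2` raises the threshold, trading a larger expected sample
size for a smaller probability of error.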


5. The main theorems. The most important result in this investigation is
obtained via five theorems. Theorem 1 establishes the (strong) consistency of
the $p$-p.m.l.e. $\hat\theta_n$. The method of proof is derived from a technique employed by
Wald in [8] to establish the consistency of the m.l.e. However, it was found that
Wald's technique was not general enough to cope with the random variables
arising from randomized experiments in the simple normal prototype example
mentioned in Section 1.

The overly restrictive nature of Wald's assumptions was eventually
recognized, and in [5], Kiefer and Wolfowitz were able to demonstrate the consistency
of the m.l.e. under a substantial relaxation of Wald's conditions.

The assumptions utilized in the present work in order to establish the
consistency of $\hat\theta_n$ bear a striking resemblance to the Kiefer-Wolfowitz conditions
(although the present work was done independently) and consequently, the
reader is referred to [5] for a full motivation for assumptions A1-A7. Assumptions
A8-a and A9 represent additional conditions governing the rate of convergence
of $\hat\theta_n$.

Theorem 2 establishes a bound on the expected sample size under a typical
procedure "$A(\gamma_1, \gamma_2)$." Assumptions B1-B6 relate most directly to Theorem 2
and are used primarily in showing that this bound holds uniformly over large
subsets of the parameter space.

Theorem 3 exhibits an upper bound on the probability of error under $A(\gamma_1, \gamma_2)$
and depends heavily on A8-b, and Theorem 4 merely combines Theorems 2 and 3,
yielding a bound on the risk under $A(\gamma_1, \gamma_2)$.
Theorem 5 follows from C1 and some known theorems about convex sets. It
establishes the sense in which the proposed class of procedures is optimal.


6. The assumptions. The set of assumptions A1-A7 are the generalization of
Wald's assumptions. A8-a and A9 permit us to analyze the rate of convergence
of $\hat\theta_n$. A8-b (which is a strengthening of A8-a) is used in establishing bounds on
the probability of error under $A(\gamma_1, \gamma_2)$.

A1: The space of (pure) experiments is a finite set consisting of $M$ elements:

$$\mathcal{E} = \{e_1, e_2, \ldots, e_M\}.$$

Associated with each $e \in \mathcal{E}$ is a random variable (r.v.) $Y_e$, which takes its values
in a measure space $(\mathcal{Y}_e, \mu_e)$. $\mu_e$ is a measure on $\mathcal{Y}_e$, and $Y_e$ has a density $f(y, \theta, e)$
with respect to $\mu_e$ for each $\theta \in \Theta$.


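A2: $$\int_{\mathcal{Y}_e} |\log f(y, \theta, e)|\, f(y, \theta, e)\,d\mu_e(y) < \infty$$

for all $\theta \in \Theta$ and all $e \in \mathcal{E}$.

Hereafter, we will denote the expectation of a Borel function $G$ of $Y_e$ by
$E_\theta G(Y_e)$:

$$E_\theta G(Y_e) = \int G(y) f(y, \theta, e)\,d\mu_e(y).$$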


A3: We assume that $\Theta$ can be embedded in a compact topological space
$(\Theta^*, \mathcal{T}^*)$ where $(\Theta^*, \mathcal{T}^*)$ is $T_1$, satisfies the first axiom of countability and
$\Theta \subset \Theta^*$. (A topological space $(\Theta^*, \mathcal{T}^*)$ is $T_1$ if, for every pair of points $\varphi, \varphi' \in \Theta^*$,
there is a set in $\mathcal{T}^*$ which contains $\varphi$ but not $\varphi'$. The space satisfies the first
countability axiom if there is a countable basis at each point. See [6] for a full
discussion of these properties.)

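We further assume that the domain of definition of $f(y, \theta, e)$ can be extended
from $\Theta$ to $\Theta^*$ in such a way that

A4: (a). For each $e \in \mathcal{E}$, $\varphi \in \Theta^*$, $f(y, \varphi, e) \ge 0$ (a.e. $\mu_e$) and

$$\int f(y, \varphi, e)\,d\mu_e(y) \le 1.$$

(b). If $\theta \in \Theta$, $\varphi \in \Theta^*$ and $\varphi \ne \theta$, then

$$\int_{[f(y, \theta, e) \ne f(y, \varphi, e)]} f(y, \theta, e)\,d\mu_e(y) > 0 \quad \text{for some } e \in \mathcal{E}.$$

A5: If $\varphi_i \to \varphi$ (in $\mathcal{T}^*$), then for each $e \in \mathcal{E}$, there is a set $D = D(e, \varphi) \subset \mathcal{Y}_e$
(which does not depend upon the sequence $\{\varphi_i\}$), for which

$$\int_D f(y, \theta, e)\,d\mu_e(y) = 0 \quad \text{for all } \theta \in \Theta$$
and for which

$$\limsup_{i \to \infty} f(y, \varphi_i, e) \le f(y, \varphi, e)$$

whenever $y \notin D$ (upper semi-continuity).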
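DEFINITION. $w(y, U, e) = \sup_{\varphi \in U} f(y, \varphi, e)$ for each $U \in \mathcal{T}^*$.

A6: For each $U \in \mathcal{T}^*$, $w(y, U, e)$ is a ($\mu_e$) measurable function of $y$.

A7: For each $\theta \in \Theta$ and each $\varphi \in \Theta^*$ $(\varphi \ne \theta)$, there is a set $V = V(\theta, \varphi) \in \mathcal{T}^*$
containing $\varphi$, for which

$$E_\theta \log^+ w(Y_e, V, e) < \infty \quad \text{for all } e \in \mathcal{E}.$$

(For any function $h$, $h^+ = \max(h, 0)$.)

Let $\mathcal{T}$ be the relativization of $\mathcal{T}^*$ to $\Theta$: $\mathcal{T} = \{U: U = V \cap \Theta,\ V \in \mathcal{T}^*\}$. $\mathcal{T}$ is a
topology on $\Theta$ (see [6]).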

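A8: (a). Given $\theta \in \Theta$ and $\varphi \in \Theta^*$ $(\theta \ne \varphi)$, there is a positive number $t = t(\theta, \varphi)$,
a set $V = V(\varphi, \theta) \in \mathcal{T}^*$ containing $\varphi$, and a set $Q = Q(\theta, \varphi) \in \mathcal{T}$ containing $\theta$,
for which

$$E_{\theta'} [w(Y_e, V, e)/f(Y_e, \theta', e)]^t < \infty \quad \text{for all } e \in \mathcal{E}$$

and all $\theta' \in Q$.
(b). Given $\theta \in \Theta$, $\varphi \in \Theta^*$ $(\varphi \ne \theta)$ and $\gamma > 0$, there is a set $V = V(\varphi, \theta, \gamma) \in \mathcal{T}^*$,
containing $\varphi$, and a set $Q = Q(\theta, \varphi, \gamma) \in \mathcal{T}$ containing $\theta$, for which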


Ew [w(Y., V, e)/f(¥., 0, e)]""” < @ for alle ¢&, 


and all $\theta' \in Q$.

A9: If $E_{\theta'} [w(Y_e, V, e)/f(Y_e, \theta', e)]^t$ exists and is finite for some $t$ $(0 < t < 1)$,
whenever $\theta'$ is in some set $Q \in \mathcal{T}$, then $E_{\theta'} [w(Y_e, V, e)/f(Y_e, \theta', e)]^t$ is upper
semi-continuous in $\theta'$ (w.r.t. $\mathcal{T}$) over $Q$.

(Let $(\Omega, \mathcal{S})$ be a topological space and let $g$ be a real valued function on $\Omega$.
The following statements are equivalent:

(a). $g$ is upper semi-continuous over $\Omega$ (in $\mathcal{S}$).

(b). For any real $k$, the set $\{\omega: g(\omega) \ge k\}$ is closed (in $\mathcal{S}$).

(c). If $\omega_i \to \omega$ (in $\mathcal{S}$), then $\limsup_{i \to \infty} g(\omega_i) \le g(\omega)$.

(d). If $\omega_i \to \omega$ (in $\mathcal{S}$), then for any $\epsilon > 0$, there is an $n$, such that $g(\omega_i) \le
g(\omega) + \epsilon$ for all $i \ge n$.

An upper semi-continuous function achieves its maximum over any ($\mathcal{S}$)
compact subset of $\Omega$.)

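The derivation of a bound on the expected sample size requires some additional
terminology which we now develop:

(a). For any $\gamma$ $(0 \le \gamma \le 1/M)$, $\Lambda_\gamma$ is the collection of probability
distributions over $\mathcal{E}$ which assign at least probability $\gamma$ to each element of $\mathcal{E}$. $\Lambda_\gamma^*$ is the
(finite) set of extreme points of $\Lambda_\gamma$:

$$\Lambda_\gamma^* = \{\lambda_{1\gamma}, \lambda_{2\gamma}, \ldots, \lambda_{M\gamma}\}$$

where:

$$\lambda_{i\gamma}\{e_j\} = \begin{cases} 1 - (M - 1)\gamma & \text{if } i = j, \\ \gamma & \text{if } i \ne j. \end{cases}$$

$\Lambda_0 = \Lambda$ = the set of all probability distributions over $\mathcal{E}$.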
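(b). For $\theta \in \Theta_1 \cup \Theta_2$,
$$a(\theta) = (\Theta_1 \cup \Theta_2) - \Theta_i \quad \text{if } \theta \in \Theta_i$$
and
$$h(\theta) = \Theta_i \quad \text{if } \theta \in \Theta_i.$$
(c). If $\varphi \in \Theta^*$, $y \in \mathcal{Y}_e$, and $x = (e, y)$ we define
$$f(x, \varphi, \lambda) = f(y, \varphi, e)\,\lambda\{e\}.$$

This is just an extension of the domain of definition of $f(x, \theta, \lambda)$ (as defined in
Section 3) from $\Theta$ to $\Theta^*$.
(d). If $e \in \mathcal{E}$, $\theta \in \Theta$ and $\varphi \in \Theta^*$,

$$I(\theta, \varphi, e) = E_\theta \log [f(Y_e, \theta, e)/f(Y_e, \varphi, e)],$$

($I(\theta, \varphi, e)$ may be $+\infty$), and if $\lambda \in \Lambda$, $\theta \in \Theta$ and $\varphi \in \Theta^*$,
$$I(\theta, \varphi, \lambda) = E_\theta \log [f(X_\lambda, \theta, \lambda)/f(X_\lambda, \varphi, \lambda)] = \sum_e \lambda\{e\}\, I(\theta, \varphi, e).$$

(e). If $\theta \in \Theta_1 \cup \Theta_2$, $\lambda_\theta^\gamma$ is that element of $\Lambda_\gamma$ for which
$$\inf_{\varphi \in a(\theta)} I(\theta, \varphi, \lambda_\theta^\gamma) = \max_{\lambda \in \Lambda_\gamma} \left[\inf_{\varphi \in a(\theta)} I(\theta, \varphi, \lambda)\right]$$
and
$$I(\theta, \gamma) = \max_{\lambda \in \Lambda_\gamma} \inf_{\varphi \in a(\theta)} I(\theta, \varphi, \lambda).$$

(f). $I(\theta) = I(\theta, 0)$.
(g). $w(x, V, \lambda) = \sup_{\varphi \in V} f(x, \varphi, \lambda)$ for all $V \in \mathcal{T}^*$.
(h). For $\theta \in \Theta$ and $V \in \mathcal{T}^*$ and $\lambda \in \Lambda$ we define

$$I(\theta, V, \lambda) = E_\theta \log [f(X_\lambda, \theta, \lambda)/w(X_\lambda, V, \lambda)].$$
($I(\theta, V, \lambda)$ may be $\pm\infty$.)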

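We now state assumptions B1-B6:

B1: For each $\gamma$ $(0 \le \gamma \le 1/M)$, $I(\theta, \gamma)$ is continuous (w.r.t. $\mathcal{T}$) over $\Theta_1 \cup \Theta_2$.

B2: If $\theta \in \Theta_1 \cup \Theta_2$, $0 < \gamma < 1/M$ and $\{\varphi_i\}$ is a sequence in $\Theta^*$ converging to
$\varphi$ (in $\mathcal{T}^*$), then $\lim_{i \to \infty} I(\theta, \varphi_i, \lambda_\theta^\gamma) = I(\theta, \varphi, \lambda_\theta^\gamma)$. (The limit may be $+\infty$.)

B3: If $\theta \in \Theta$ and $I(\theta, V, e) < \infty$, then $I(\theta', V, e)$ is continuous in some ($\mathcal{T}$)
neighborhood of $\theta$.

B4: If $\theta \in \Theta$, $0 < \gamma < 1/M$ and $I(\theta, V, e) < \infty$ for all $e$, then $I(\theta', V, \lambda_{\theta'}^\gamma)$
is continuous in some ($\mathcal{T}$) neighborhood of $\theta$.

B5: $I(\theta) > 0$ for all $\theta \in \Theta_1 \cup \Theta_2$.

B6: $\Theta_1$ and $\Theta_2$ are in $\mathcal{T}$.

In order to establish the desired optimality property, we require

C1: If $\theta \in \Theta$, $\varphi \in \Theta$ and $I(\theta, \varphi, e) < \infty$, then

$$E_\theta \{\log [f(Y_e, \theta, e)/f(Y_e, \varphi, e)]\}^2 < \infty.$$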
7. Consistency of the $p$-p.m.l.e. $\hat\theta_n$. Before attempting to establish the
consistency of $\hat\theta_n$, we need to investigate the underlying structure associated with
the problem as we have formulated it.

LEMMA 1. If $\theta \in \Theta$, $\varphi \in \Theta^*$ and $\lambda \in \Lambda$, then $I(\theta, \varphi, \lambda) \ge 0$ with equality if and
only if $f(x, \theta, \lambda) = f(x, \varphi, \lambda)$ on a set of $P_\theta$ probability measure one.

PROOF. For any r.v. $Z$, $\exp E Z \le E \exp Z$ with equality if and only if $Z$ is
constant with probability one (Jensen's inequality). The conclusion follows with
$Z = \log [f(X_\lambda, \varphi, \lambda)/f(X_\lambda, \theta, \lambda)]$, by applying A4-a.

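LEMMA 2. Given $\theta \in \Theta$, $\varphi \in \Theta^*$ $(\varphi \ne \theta)$, there is a decreasing sequence of sets $\{V_n\}$
such that $V_n \in \mathcal{T}^*$ for every $n$, $\bigcap_{n=1}^{\infty} V_n = \{\varphi\}$, and

$$\lim_{n \to \infty} E_\theta \log w(Y_e, V_n, e) = E_\theta \log f(Y_e, \varphi, e).$$
PROOF. Let $\{U_n\}$ be a countable basis for $\mathcal{T}^*$ at $\varphi$ and let $V_n = \bigcap_{j=1}^{n} U_j$. Then
$\{V_n\}$ is a decreasing sequence of sets in $\mathcal{T}^*$ and $\varphi$ lies in every $V_n$. Since $(\Theta^*, \mathcal{T}^*)$
is a $T_1$ space, we have in fact, $\bigcap_{n=1}^{\infty} V_n = \{\varphi\}$. By A7, there is an $n_0$ such that
$$E_\theta \log w(Y_e, V_n, e) \le E_\theta \log w(Y_e, V_{n_0}, e) < \infty$$

for all $e \in \mathcal{E}$ and all $n \ge n_0$. The conclusion follows by A5 and the monotone
convergence theorem.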


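LEMMA 3. Given $\gamma$ $(0 < \gamma < 1/M)$, $\theta \in \Theta$ and $\varphi \in \Theta^*$ $(\varphi \ne \theta)$, there is a set
$V = V(\varphi, \theta, \gamma) \in \mathcal{T}^*$, containing $\varphi$, and a constant $\beta = \beta(\theta, \varphi, \gamma) < 0$, such that

$$E_\theta \log \frac{w(X_\lambda, V, \lambda)}{f(X_\lambda, \theta, \lambda)} < \beta \quad \text{for all } \lambda \in \Lambda_\gamma.$$

PROOF. By A4-b and Lemmas 1 and 2,

$$\lim_{n \to \infty} E_\theta \log w(Y_e, V_n, e) = E_\theta \log f(Y_e, \varphi, e) \le E_\theta \log f(Y_e, \theta, e)$$
with strict inequality for at least one $e \in \mathcal{E}$. Since $E_\theta |\log f(Y_e, \theta, e)| < \infty$ (by
A2), it follows that
$$\lim_{n \to \infty} E_\theta \log w(X_\lambda, V_n, \lambda) < E_\theta \log f(X_\lambda, \theta, \lambda)$$
for all $\lambda \in \Lambda_\gamma^*$. Since $\Lambda_\gamma^*$ spans $\Lambda_\gamma$, the conclusion follows.
It should be observed that Lemma 3 is not, in general, true for $\gamma = 0$, for if
$I(\theta, \varphi, e') = 0$ and $\lambda$ places unit probability on $e'$, then for all $V \in \mathcal{T}^*$ containing $\varphi$,

$$E_\theta \log w(X_\lambda, V, \lambda) \ge E_\theta \log f(X_\lambda, \theta, \lambda).$$

This situation actually occurs in the prototype example of Section 1. If $\theta =
(m_1, m_2)$ and $\varphi = (m_1, m_2')$, then $I(\theta, \varphi, e_1) = 0$.
The following lemma permits us to apply the assumptions concerning $f(y, \theta, e)$
and $I(\theta, \varphi, e)$ directly to $f(x, \theta, \lambda)$ and $I(\theta, \varphi, \lambda)$. The proof is left to the reader.
LEMMA 4. A2, A4, A5, A6, A7, A8 and A9 remain true if $\mathcal{E}$ is replaced by $\Lambda$,
$e$ by $\lambda$, $Y_e$ by $X_\lambda$, $y$ by $x$, $\mathcal{Y}_e$ by $\mathcal{X}$ and $\mu_e$ by $\mu$ (see Section 3) throughout.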
The following inequality will be used in investigating the rate of convergence
of $\hat\theta_n$.

LEMMA 5. For any r.v. $Z$, $P[Z \ge 0] \le E(e^{tZ})$ for all $t > 0$.
PROOF. If $E(e^{tZ}) = \infty$, then the inequality is trivial. Otherwise,

$$E(e^{tZ}) \ge E(e^{tZ} \mid Z \ge 0)\,P\{Z \ge 0\}.$$

Since $E(e^{tZ} \mid Z \ge 0) \ge 1$ when $t > 0$, the conclusion follows.
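
A quick numerical illustration of Lemma 5, under the assumption (made only for
this example) that $Z$ is normal with mean $-1$ and unit variance, so that both
sides are available in closed form: $P[Z \ge 0] = \Phi(-1)$ and
$E(e^{tZ}) = \exp(-t + t^2/2)$. The function names are hypothetical.

```python
import math


def prob_nonnegative(mean):
    """P[Z >= 0] for Z ~ N(mean, 1), written with the complementary error function."""
    return 0.5 * math.erfc(-mean / math.sqrt(2.0))


def mgf_normal(t, mean):
    """E exp(tZ) for Z ~ N(mean, 1)."""
    return math.exp(mean * t + 0.5 * t * t)


mean = -1.0
p_tail = prob_nonnegative(mean)
for t in (0.25, 0.5, 1.0, 2.0, 3.0):
    # Lemma 5: the moment generating function dominates the tail probability.
    print(t, p_tail, mgf_normal(t, mean))
```

The bound is loose for a single variable but becomes useful for sums, where the
moment generating function factorizes and yields exponential decay in $n$, as
in the proof of Theorem 1.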

In order to establish the consistency of $\hat\theta_n$, we shall show that for any set
$S \in \mathcal{T}$ containing the true parameter $\theta$, the probability (under $\theta$) that $\hat\theta_n \in S$ for
all $n$ sufficiently large, is unity.

DEFINITION. For any set $S \in \mathcal{T}$, we define $T_S$ to be the smallest integer $m$, such
that $\hat\theta_n \in S$ for all $n \ge m$ if such an integer exists; if no such integer exists, we
define $T_S = +\infty$.

We now derive a bound on

$$P_\theta[\hat\theta_m \notin S \text{ for some } m \ge n].$$


THEOREM 1. Let $\gamma$ $(0 < \gamma < 1/M)$, $S \in \mathcal{T}$ and $\theta \in S$ be given. If experiments
$\lambda^{(j)}$ are chosen from $\Lambda_\gamma$ according to any measurable procedure (i.e., such that $\lambda^{(j)}$
is a measurable function of the previous $(j - 1)$ observations), then there are finite
positive constants $k$ and $b$ and a ($\mathcal{T}$) neighborhood $Q$ of $\theta$ for which

$$P_{\theta'}[T_S > m] \le k \exp(-bm) \quad \text{for all } \theta' \in Q.$$

($k$, $b$ and $Q$ depend upon $\theta$, $\gamma$ and $S$.)
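OUTLINE OF PROOF.

$$P_{\theta'}(T_S > m) \le \sum_{n \ge m} P_{\theta'}(\hat\theta_n \notin S).$$

If $\hat\theta_n \notin S$, then since $S = S^* \cap \Theta$ (where $S^* \in \mathcal{T}^*$), it follows that $\hat\theta_n \notin S^*$. Hence,
by definition of $\hat\theta_n$,

$$P_{\theta'}[\hat\theta_n \notin S] \le P_{\theta'}\left[\sup_{\varphi \in \Theta^* - S^*} \sum_{j=1}^{n} \log \frac{f(X^{(j)}, \varphi, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})} \ge \log p\right]$$

for all $\theta' \in S$.
By virtue of the ($\mathcal{T}^*$) compactness of $\Theta^* - S^*$, Lemma 3, A8-a, A9 and the
convexity of moment generating functions, we can choose a finite collection of
sets $\{V_1, V_2, \ldots, V_\nu\} \subset \mathcal{T}^*$, a set $Q \in \mathcal{T}$ containing $\theta$, and positive constants
$b$ and $t$, such that

$$\max_{1 \le i \le \nu}\ \max_{\lambda \in \Lambda_\gamma} E_{\theta'} \exp\left\{t \log \frac{w(X_\lambda, V_i, \lambda)}{f(X_\lambda, \theta', \lambda)}\right\} \le e^{-b}$$

and

$$P_{\theta'}[\hat\theta_n \notin S] \le \sum_{i=1}^{\nu} P_{\theta'}\left[\sum_{j=1}^{n} \log \frac{w(X^{(j)}, V_i, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})} \ge \log p\right]$$

for all $\theta' \in Q$. ($\nu$, $V_i$, $t$, $b$ and $Q$ depend upon $S$, $\gamma$ and $\theta$.) We then apply Lemma 5
and obtain

$$P_{\theta'}[\hat\theta_n \notin S] \le \nu\, p^{-t} e^{-bn}$$

for all $\theta' \in Q$, from which the conclusion follows.
COROLLARY. For any $\theta \in \Theta$, $P_\theta[\hat\theta_n \to \theta] = 1$. (The convergence is relative to $\mathcal{T}$.)
PROOF. $\hat\theta_n \to \theta$ if and only if $T_S < \infty$ for every set $S \in \mathcal{T}$ containing $\theta$.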


8. Bounds on the expected sample size under procedure $A(\gamma_1, \gamma_2)$. In this
section, bounds on the expected sample size are derived. Lemmas 6, 7, and 8
are building blocks and Lemma 9 is the keystone of the main result of this section.
The proof of Lemma 6 follows that of Lemma 2 and is left to the reader.

LEMMA 6. If $\theta \in \Theta$, $\varphi \in \Theta^*$ and $I(\theta, \varphi, e) = \infty$, then for any constant $C > 0$,
there is a set $V = V(\varphi, \theta, e, C) \in \mathcal{T}^*$ containing $\varphi$ for which $I(\theta, V, e) > C$.

DEFINITION. For $\theta \in \Theta_1 \cup \Theta_2$, let $\bar a(\theta)$ be the ($\mathcal{T}^*$) closure of $a(\theta)$.


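LEMMA 7. If $\theta \in \Theta_1 \cup \Theta_2$, $0 < \gamma < 1/M$ and $\varphi \in \bar a(\theta)$ then $I(\theta, \varphi, \lambda_\theta^\gamma) \ge I(\theta, \gamma)$.


PROOF. If $I(\theta, \varphi, \lambda_\theta^\gamma)$ is infinite, the assertion is trivial. Otherwise, there is a
sequence $\{\varphi_i\} \subset a(\theta)$ converging to $\varphi$. Let $\delta > 0$ be given. By assumption B2
we may choose $i$ so that $I(\theta, \varphi_i, \lambda_\theta^\gamma) \le I(\theta, \varphi, \lambda_\theta^\gamma) + \delta$. Since


$$\varphi_i \in a(\theta), \qquad I(\theta, \varphi_i, \lambda_\theta^\gamma) \ge I(\theta, \gamma),$$


and since $\delta$ is arbitrary, the conclusion follows.
LEMMA 8. If $\theta \in \Theta_1 \cup \Theta_2$ and $0 \le \gamma < 1/M$, $I(\theta, \gamma) > 0$.
PROOF. By B5, $I(\theta, 0) = I(\theta) = \inf_{\varphi \in a(\theta)} I(\theta, \varphi, \lambda_\theta) > 0$. Let


$$\lambda\{e\} = (1 - M\gamma)\lambda_\theta\{e\} + \gamma.$$


Then $\lambda \in \Lambda_\gamma$ and $\lambda_\theta\{e\} = [\lambda\{e\} - \gamma]/[1 - M\gamma]$. Since $\inf_{\varphi \in a(\theta)} I(\theta, \varphi, \lambda_\theta) > 0$, it
follows that for some $\delta > 0$, $\inf_{\varphi \in a(\theta)} [I(\theta, \varphi, \lambda) - \gamma \sum_{e} I(\theta, \varphi, e)] > \delta$. Thus,
$\max_{\lambda' \in \Lambda_\gamma} \inf_{\varphi \in a(\theta)} I(\theta, \varphi, \lambda') \ge \inf_{\varphi \in a(\theta)} I(\theta, \varphi, \lambda) > \delta > 0$.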

In Theorem 2, we show that the expected sample size under $A(\gamma_1, \gamma_2)$ is not
much larger than $-(1 + \gamma_2) \log c / I(\theta, \gamma_1)$ when $\theta$ is the state of nature and $c$ is
small. To do so, we will show in Lemma 9 that $P_\theta[N > n]$ (where $N$ is the sample
size) declines rapidly (in fact exponentially) when $n$ is significantly larger than
$-(1 + \gamma_2) \log c / I(\theta, \gamma_1)$. The proof of Lemma 9 is complicated, so the general
idea is sketched roughly below to help the reader see the forest for the trees:

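The event $[N > n]$ is (by definition of the stopping rule) contained in the event


$$\Big[\sup_{\theta''} \inf_{\varphi \in a(\theta'')} \sum_{j=1}^{n} \log \frac{f(X^{(j)}, \theta'', \lambda^{(j)})}{f(X^{(j)}, \varphi, \lambda^{(j)})} \le -(1 + \gamma_2) \log c\Big]$$

and this event is, in turn, contained in the event

$$\Big[\inf_{\varphi \in a(\theta)} \sum_{j=1}^{n} \log \frac{f(X^{(j)}, \theta', \lambda^{(j)})}{f(X^{(j)}, \varphi, \lambda^{(j)})} < -(1 + \gamma_2) \log c\Big].$$


If $\theta$ is the true state of nature (s.o.n.), then by appropriately choosing a finite number of ($\mathcal{T}^*$)
neighborhoods $V_1, \dots, V_r$ so as to cover the ($\mathcal{T}^*$) compact set $\bar a(\theta)$, we will be
able to assert that for all $\theta' \in h(\theta)$

$$P_{\theta'}[N > n] \le \sum_{i=1}^{r} P_{\theta'}\Big[\sum_{j=1}^{n} \log \frac{f(X^{(j)}, \theta', \lambda^{(j)})}{w(X^{(j)}, V_i, \lambda^{(j)})} < -(1 + \gamma_2) \log c\Big],$$

or more conveniently,

$$P_{\theta'}[N > n] \le \sum_{i=1}^{r} P_{\theta'}\Big[\sum_{j=1}^{n} \log \frac{w(X^{(j)}, V_i, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})} > (1 + \gamma_2) \log c\Big].$$

If $n$ is larger than $-(1 + \delta)(1 + \gamma_2) \log c / I(\theta', \gamma_1)$, then

$$P_{\theta'}[N > n] \le \sum_{i=1}^{r} P_{\theta'}\Big[\sum_{j=1}^{n} \Big(\log \frac{w(X^{(j)}, V_i, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})} + \frac{I(\theta', \gamma_1)}{1 + \delta}\Big) > 0\Big].$$

The summand can be decomposed into three terms:

$$\log \frac{w(X^{(j)}, V_i, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})} + \frac{I(\theta', \gamma_1)}{1 + \delta} = A_{1j} + A_{2j} + A_{3j},$$

where

$$A_{1j} = \log \frac{w(X^{(j)}, V_i, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})} + I(\theta', V_i, \lambda^{(j)}) - \frac{\delta I(\theta', \gamma_1)}{2(1 + \delta)},$$

$$A_{2j} = -I(\theta', V_i, \lambda^{(j)}) + I(\theta', V_i, \lambda_{\theta'}^{\gamma_1}),$$

$$A_{3j} = -I(\theta', V_i, \lambda_{\theta'}^{\gamma_1}) + I(\theta', \gamma_1)\Big(1 - \frac{\delta}{2(1 + \delta)}\Big).$$


Since

$$I(\theta', V_i, \lambda^{(j)}) = -E_{\theta'} \log \frac{w(X^{(j)}, V_i, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})},$$

$A_{1j}$ has negative mean and hence, $\sum_{j=1}^{n} A_{1j}$ tends to be negative. The particular
choice of $\lambda^{(j)}$ (from $\Lambda_{\gamma_1}$) will insure that $\sum_{j=1}^{n} A_{2j}$ grows very slowly when $n$ is large.
The neighborhoods $V_i$ will be taken so small that $A_{3j}$ will be approximately
$-\delta I(\theta', \gamma_1)/2(1 + \delta)$. Hence, $\sum_{j=1}^{n} (A_{1j} + A_{2j} + A_{3j})$ tends to be negative,
and exceeds zero with small probability.

DEFINITION. Let $N$ be the sample size required to reach a terminal decision
under $A(\gamma_1, \gamma_2)$.

LEMMA 9. Let $\theta$ be a point in $\Theta_1 \cup \Theta_2$ and let $\delta$ $(0 < \delta < 1)$, $\gamma_1$ $(0 < \gamma_1 < 1/M)$,
and $\gamma_2$ $(\gamma_2 > 0)$ be given. Then there are finite positive constants $b$ and $k$ and a ($\mathcal{T}$)
neighborhood $Q$ of $\theta$ such that for $\theta' \in Q$ and


$$n > -(1 + \delta)(1 + \gamma_2) \log c / I(\theta', \gamma_1), \qquad P_{\theta'}[N > n] \le k e^{-bn}$$


under $A(\gamma_1, \gamma_2)$. ($k$, $b$ and $Q$ depend upon $\gamma_1$, $\theta$ and $\delta$.)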

PROOF. For notational convenience, $Q$ with or without subscripts will denote
a generic ($\mathcal{T}$) neighborhood of $\theta$ throughout the following discussion. If a result
holds in a ($\mathcal{T}$) neighborhood of $\theta$ and a second result holds in a second ($\mathcal{T}$) neighborhood
of $\theta$, then the results are simultaneously true in the intersection of these
neighborhoods which is itself a (non-vacuous) ($\mathcal{T}$) neighborhood of $\theta$, and hence,
no ambiguity can arise, provided that we only require a finite number of statements
(each of which is true in a neighborhood of $\theta$) to be simultaneously true
in a neighborhood of $\theta$.

As was mentioned in the introduction, given $\theta \in \Theta_1 \cup \Theta_2$,


$$(8.1) \qquad P_{\theta'}[N > n] \le P_{\theta'}\Big[\inf_{\varphi \in a(\theta)} \sum_{j=1}^{n} \log \frac{f(X^{(j)}, \theta', \lambda^{(j)})}{f(X^{(j)}, \varphi, \lambda^{(j)})} < -(1 + \gamma_2) \log c\Big]$$


for all $\theta' \in h(\theta)$. (Notice that if $\theta \in \Theta_1 \cup \Theta_2$ then by B6, $h(\theta) \cap \bar a(\theta)$ is empty.)
For each $\varphi \in \bar a(\theta)$, choose $V = V(\varphi, \theta, \delta) \in \mathcal{T}^*$ containing $\varphi$ so that


, [wl¥.,V,e) | 
“_ are ae 0, ¢) | 


for some $t = t(\theta, \varphi) > 0$ whenever $\theta'$ is in some ($\mathcal{T}$) neighborhood $Q = Q(\theta, \varphi, \delta)$
of $\theta$ and


; = (1(0, v, e) — 61(0,7:)/2(.1 +5), if I(0,¢,e) < ~ 
(8.3, 4) I(0,V,e) > ri Cathie & tke & «<: 


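((8.2) is possible by A8-a, (8.3) is possible by Lemma 2, and (8.4) is possible
by Lemma 6.) Since $\bar a(\theta)$ is ($\mathcal{T}^*$) closed and hence compact, there is a finite set
of points $\{\varphi_1, \dots, \varphi_r\}$ ($r = r(\theta, \delta, \gamma_1)$), for which


$$(8.5) \qquad \bar a(\theta) \subset \bigcup_{i=1}^{r} V_i \quad (\text{where } V_i = V(\varphi_i, \theta, \delta)).$$

Hence, under $A(\gamma_1, \gamma_2)$,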


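$$(8.6) \qquad P_{\theta'}[N > n] \le \sum_{i=1}^{r} P_{\theta'}\Big[\sum_{j=1}^{n} v_{ij}(\theta') > (1 + \gamma_2) \log c\Big]$$

for all $\theta' \in h(\theta)$, where

$$(8.7) \qquad v_{ij}(\theta') = \log \frac{w(X^{(j)}, V_i, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})}.$$

(Keep in mind the fact that $v_{ij}(\theta')$ depends upon $\theta$ implicitly through $V_i = V(\varphi_i, \theta, \delta)$, and that, by definition,


$$(8.8) \qquad E_{\theta'}\big[v_{ij}(\theta') + I(\theta', V_i, \lambda^{(j)}) \,\big|\, X^{(1)}, X^{(2)}, \dots, X^{(j-1)}\big] = 0.)$$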


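Now, suppose that

$$(8.9) \qquad n > -(1 + \gamma_2)(1 + \delta) \log c / I(\theta', \gamma_1).$$

Then


$$(8.10) \qquad P_{\theta'}[N > n] \le \sum_{i=1}^{r} P_{\theta'}\Big[\sum_{j=1}^{n} \Big(v_{ij}(\theta') + \frac{I(\theta', \gamma_1)}{1 + \delta}\Big) > 0\Big].$$

Let

$$(8.11) \qquad J(\theta', \theta) = \{i : I(\theta', V_i, e) < \infty \text{ for all } e \in \mathcal{E}\}.$$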
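By B3, there is a ($\mathcal{T}$) neighborhood $Q$ of $\theta$ such that $I(\theta', V_i, e) < \infty$
if $I(\theta, V_i, e) < \infty$ and $\theta' \in Q$. Hence, for all $i \in J(\theta, \theta)$, and all $\theta' \in Q$,


$$(8.12) \qquad P_{\theta'}\Big[\sum_{j=1}^{n} \Big(v_{ij}(\theta') + \frac{I(\theta', \gamma_1)}{1 + \delta}\Big) > 0\Big] \le P_{\theta'}[C_{1i} > 0] + P_{\theta'}[C_{2i} > 0] + P_{\theta'}[C_{3i} > 0],$$


where


$$(8.12a) \qquad C_{1i} = \sum_{j=1}^{n} \Big[v_{ij}(\theta') + I(\theta', V_i, \lambda^{(j)}) - \frac{\delta I(\theta', \gamma_1)}{4(1 + \delta)}\Big],$$


$$(8.12b) \qquad C_{2i} = \sum_{j=1}^{n} \Big[-I(\theta', V_i, \lambda^{(j)}) + I(\theta', V_i, \lambda_{\theta'}^{\gamma_1}) - \frac{\delta I(\theta', \gamma_1)}{4(1 + \delta)}\Big],$$


and

$$(8.12c) \qquad C_{3i} = \sum_{j=1}^{n} \Big[-I(\theta', V_i, \lambda_{\theta'}^{\gamma_1}) + I(\theta', \gamma_1) - \frac{\delta I(\theta', \gamma_1)}{2(1 + \delta)}\Big].$$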


Given the first $j - 1$ trials, the $j$th summand of $C_{1i}$ has a finite moment generating function in some
neighborhood of $\theta$ and a negative mean at $\theta$. Applying Lemma 5 as in Theorem 1,
we can show that there is a ($\mathcal{T}$) neighborhood $Q$ of $\theta$ and a positive constant $b_1$
($Q$ and $b_1$ depend upon $\theta$ and $\delta$) such that


$$(8.13) \qquad P_{\theta'}[C_{1i} > 0] \le e^{-b_1 n}$$


for all $\theta' \in Q$, and $i \in J(\theta, \theta)$. As a direct consequence of (8.3), (8.4)
and Lemma 8,


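$$(8.14) \qquad I(\theta, V_i, \lambda_\theta^{\gamma_1}) > I(\theta, \gamma_1) - \frac{\delta I(\theta, \gamma_1)}{2(1 + \delta)}$$

for $i \in J(\theta, \theta)$ (in fact, for all $i$, but we don't use this). By B1 and B4, there is
a ($\mathcal{T}$) neighborhood $Q$ of $\theta$ (depending upon $\delta$ and $\gamma_1$) for which

$$(8.15) \qquad I(\theta', V_i, \lambda_{\theta'}^{\gamma_1}) > I(\theta', \gamma_1) - \frac{\delta I(\theta', \gamma_1)}{2(1 + \delta)}$$

whenever $\theta' \in Q$ and $i \in J(\theta, \theta)$. Consequently,

$$(8.16) \qquad P_{\theta'}[C_{3i} > 0] = 0$$

whenever $\theta' \in Q$.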
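In order to deal with the expressions $P_{\theta'}[C_{2i} > 0]$, we recall that B3 and B4
permit us to choose a ($\mathcal{T}$) neighborhood $Q^* \subset h(\theta)$ of $\theta$ ($Q^*$ depends upon $\theta$
and $\delta$) for which the following statements are simultaneously true:


$$(8.17) \qquad \Delta = \Delta(\theta, \delta) = \sup_{\theta' \in Q^*} \Big\{\max_{e, e' \in \mathcal{E}} \Big[\max_{i \in J(\theta, \theta)} |I(\theta', V_i, e) - I(\theta', V_i, e')|\Big]\Big\} < \infty,$$

$$(8.18) \qquad |I(\theta', V_i, \lambda) - I(\theta'', V_i, \lambda)| < \mu_{Q^*}/8$$

for all $\lambda \in \Lambda_{\gamma_1}$, and

$$(8.18b) \qquad |I(\theta', V_i, \lambda_{\theta'}^{\gamma_1}) - I(\theta'', V_i, \lambda_{\theta''}^{\gamma_1})| < \mu_{Q^*}/8$$

for all $\theta', \theta'' \in Q^*$ and all $i \in J(\theta, \theta)$,
where

$$(8.19) \qquad \mu_{Q^*} = \inf_{\theta' \in Q^*} \frac{\delta I(\theta', \gamma_1)}{2(1 + \delta)}$$

is positive by virtue of Lemma 8 and B1.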
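Let $T_{Q^*}$ be as defined in Theorem 1. If $i \in J(\theta, \theta)$, then for all $\theta' \in Q^*$,

$$(8.20) \qquad C_{2i} = D_{1i} + D_{2i} + D_{3i} + D_{4i}$$


where by (8.17),


$$(8.21) \qquad D_{1i} = \sum_{1 \le j \le T_{Q^*}} \big[-I(\theta', V_i, \lambda^{(j)}) + I(\theta', V_i, \lambda_{\theta'}^{\gamma_1})\big] \le \Delta T_{Q^*},$$


$$(8.22) \qquad D_{2i} = \sum_{T_{Q^*} < j \le n} \big[-I(\theta', V_i, \lambda^{(j)}) + I(\hat\theta_j, V_i, \lambda^{(j)})\big] \le n \mu_{Q^*}/8$$


(since $\lambda^{(j)} = \lambda_{\hat\theta_j}^{\gamma_1}$ and $\hat\theta_j \in Q^* \subset \Theta_1 \cup \Theta_2$ for $j > T_{Q^*}$),

$$(8.23) \qquad D_{3i} = \sum_{T_{Q^*} < j \le n} \big[-I(\hat\theta_j, V_i, \lambda_{\hat\theta_j}^{\gamma_1}) + I(\theta', V_i, \lambda_{\theta'}^{\gamma_1})\big] \le n \mu_{Q^*}/8$$

(since $\theta'$ and $\hat\theta_j$ are in $Q^*$ for $j > T_{Q^*}$), and

$$(8.24) \qquad D_{4i} = -n \delta I(\theta', \gamma_1)/4(1 + \delta) \le -n \mu_{Q^*}/2.$$

Thus, for all $\theta' \in Q^*$, $i \in J(\theta, \theta)$,

$$(8.25) \qquad P_{\theta'}[C_{2i} > 0] \le P_{\theta'}[T_{Q^*} > n \mu_{Q^*}/4\Delta].$$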

By Theorem 1, there are finite positive constants $k_2$ and $b_2$, and a ($\mathcal{T}$) neighborhood
$Q$ of $\theta$ such that

$$(8.26) \qquad P_{\theta'}[T_{Q^*} > n \mu_{Q^*}/4\Delta] \le k_2 \exp(-b_2 n)$$


for all $\theta' \in Q$. ($k_2$, $b_2$ and $Q$ depend upon $\theta$, $\delta$ and $\gamma_1$.)

Combining (8.12), (8.13), (8.16), (8.25) and (8.26), we see that there are
finite positive constants $b_3$ and $k_3$ and a ($\mathcal{T}$) neighborhood $Q$ of $\theta$, such that for
$\theta' \in Q$,


$$(8.27) \qquad \sum_{i \in J(\theta, \theta)} P_{\theta'}\Big[\sum_{j=1}^{n} v_{ij}(\theta') > -n I(\theta', \gamma_1)/(1 + \delta)\Big] \le k_3 \exp(-b_3 n).$$

If $i \notin J(\theta, \theta)$ then $E_\theta v_{ij}(\theta) = -\infty$ under $A(\gamma_1, \gamma_2)$ and the technique of
Theorem 1 will establish that there are finite positive constants $k_4$ and $b_4$ and a
($\mathcal{T}$) neighborhood $Q$ of $\theta$ for which


$$(8.28) \qquad \sum_{i \notin J(\theta, \theta)} P_{\theta'}\Big[\sum_{j=1}^{n} v_{ij}(\theta') > -n I(\theta', \gamma_1)/(1 + \delta)\Big] \le k_4 \exp(-b_4 n)$$

whenever $\theta' \in Q$. By adding (8.27) and (8.28), and comparing with (8.10), the
conclusion follows. This brings us to the main result of this section:

THEOREM 2. Let $\theta$ be a point of $\Theta_1 \cup \Theta_2$ and let $\epsilon > 0$, $\gamma_1$ $(0 < \gamma_1 < 1/M)$,
and $\gamma_2$ $(\gamma_2 > 0)$ be given. Then there is a function $\xi(c) = \xi(c; \theta, \epsilon, \gamma_1)$ which (for
fixed $\theta$, $\epsilon$ and $\gamma_1$) tends to zero as $c$ approaches zero, and a ($\mathcal{T}$) neighborhood $Q = Q(\theta, \epsilon, \gamma_1)$
of $\theta$, such that for all $\theta' \in Q$


$$E_{\theta'}(N) \le -(1 + \gamma_2)(1 + \epsilon + \xi(c)) \log c / I(\theta', \gamma_1)$$


under procedure $A(\gamma_1, \gamma_2)$.
PROOF. For any $n^* \ge 1$


$$E_{\theta'}(N) \le n^* + \sum_{j \ge n^*} P_{\theta'}[N > j].$$


By Lemma 9, there are finite positive constants $k$ and $b$ and a ($\mathcal{T}$) neighborhood
$Q$ of $\theta$ (all depending upon $\theta$, $\epsilon$ and $\gamma_1$) for which


$$P_{\theta'}[N > n] \le k \exp(-bn)$$


whenever $\theta' \in Q$ and $n \ge n^* = -(1 + \epsilon)(1 + \gamma_2) \log c / I(\theta', \gamma_1)$. Thus, under
$A(\gamma_1, \gamma_2)$


$$E_{\theta'}(N) \le n^* + k' \exp(-bn^*)$$


whenever $\theta' \in Q$. By Lemma 8 and B1, we can assure without loss of generality
that $Q$ is chosen so that $I(\theta', \gamma_1)$ is bounded and bounded away from zero in $Q$.
The desired result follows with $\xi(c) = -k'' c^{b''} / \log c$, where $k''$ and $b''$ are appropriately
defined positive constants.


9. Bounds on the probability of error under $A(\gamma_1, \gamma_2)$. In this section we
show that for $0 < \gamma_1 < 1/M$ and $\gamma_2 > 0$, the probability of error under $A(\gamma_1, \gamma_2)$
is $O(c)$ (i.e., less than or equal to a constant multiple of $c$). The essential idea is
sketched in

THEOREM 3. If $\gamma_1$ $(0 < \gamma_1 < 1/M)$ and $\gamma_2$ $(\gamma_2 > 0)$ are given and $\theta$ is a
point of $\Theta_1 \cup \Theta_2$, then there is a constant $W$ and a ($\mathcal{T}$) neighborhood $Q$ of $\theta$ for which
the probability of error under $A(\gamma_1, \gamma_2)$ is


$$\alpha(\theta') \le Wc$$


uniformly for $\theta' \in Q$. ($W$ and $Q$ depend upon $\gamma_1$, $\gamma_2$ and $\theta$.)
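OUTLINE OF PROOF. Let $\theta$ be the true state of nature. By B6, $\theta$ is an interior
point of its hypothesis and hence, there is a set $A^* \in \mathcal{T}^*$ such that


$$\theta \in A^* \cap \Theta \subset h(\theta).$$


If an error is committed on the $n$th trial, then


$$\sup_{\varphi \in \Theta^* - A^*} \sum_{j=1}^{n} \log \frac{f(X^{(j)}, \varphi, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})} \ge -(1 + \gamma_2) \log c$$

for all $\theta' \in A^* \cap \Theta$.
If we properly combine Lemmas 2 and 4 with A4 and A8-b, and use the compactness
of $\Theta^* - A^*$, it is possible to pick a finite collection of sets
$\{V_1, V_2, \dots, V_p\} \subset \mathcal{T}^*$ having the properties that


$$\sup_{\varphi \in \Theta^* - A^*} \sum_{j=1}^{n} \log \frac{f(X^{(j)}, \varphi, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})} \le \max_{1 \le i \le p} \sum_{j=1}^{n} \log \frac{w(X^{(j)}, V_i, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})}$$


for all $\theta' \in A^* \cap \Theta$, and


$$E_{\theta'}\Big[\exp\Big\{(1 + \gamma_2)^{-1} \log \frac{w(X^{(j)}, V_i, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})}\Big\}\Big] < g$$


(where $0 < g < 1$) for all $\theta'$ in some ($\mathcal{T}$) neighborhood of $\theta$.
We now can apply Lemma 5 and obtain


$$\alpha(\theta') \le \sum_{n=1}^{\infty} \sum_{i=1}^{p} P_{\theta'}\Big[\sum_{j=1}^{n} \log \frac{w(X^{(j)}, V_i, \lambda^{(j)})}{f(X^{(j)}, \theta', \lambda^{(j)})} \ge -(1 + \gamma_2) \log c\Big] \le p\,c \sum_{n=1}^{\infty} g^n$$

for all $\theta'$ in some ($\mathcal{T}$) neighborhood $Q$ of $\theta$.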


10. Bounds on the risk under $A(\gamma_1, \gamma_2)$. At this point, we are almost ready
to combine Theorems 2 and 3. However, a lemma concerning the relationship
between $I(\theta, \gamma)$ and $I(\theta)$ is required:

LEMMA 12. Let $K$ be any ($\mathcal{T}$) compact subset of $\Theta_1 \cup \Theta_2$. Then, $\lim_{\gamma \to 0} I(\theta, \gamma) = I(\theta)$
uniformly for $\theta \in K$.

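PROOF. By definition, $I(\theta, \gamma) \le I(\theta) = I(\theta, 0)$, for all $\gamma$. Let $\lambda^*\{e\} = (1 - M\gamma)\lambda_\theta\{e\} + \gamma$. ($\lambda^* \in \Lambda_\gamma$.) Then


$$I(\theta, \varphi, \lambda^*) = (1 - M\gamma) I(\theta, \varphi, \lambda_\theta) + \gamma \sum_{e} I(\theta, \varphi, e) \ge (1 - M\gamma) I(\theta, \varphi, \lambda_\theta).$$

But then,


$$I(\theta, \gamma) \ge \inf_{\varphi \in a(\theta)} I(\theta, \varphi, \lambda^*) \ge (1 - M\gamma) \inf_{\varphi \in a(\theta)} I(\theta, \varphi, \lambda_\theta) = (1 - M\gamma) I(\theta, 0).$$


So,

$$I(\theta, 0) \ge \lim_{\gamma \to 0} I(\theta, \gamma) \ge I(\theta, 0).$$


Since the convergence is monotonic and since $I(\theta, \gamma)$ and $I(\theta, 0)$ are ($\mathcal{T}$)
continuous, the convergence is uniform on ($\mathcal{T}$) compacta.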
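
The key inequality in this proof, that shrinking a design toward the uniform design costs at most a factor $(1 - M\gamma)$ in worst-case information, can be checked numerically. The following is a minimal sketch under stated assumptions, not part of the paper: the alternative set is replaced by a finite list, the table `INFO` of values standing in for $I(\theta, \varphi, e)$ is made up, and the function names are hypothetical.

```python
# Hypothetical information table: rows are alternatives phi, columns are
# experiments e. INFO[p][e] stands in for I(theta, phi_p, e); values are made up.
INFO = [
    [0.50, 0.02],
    [0.03, 0.40],
    [0.20, 0.25],
]
M = 2  # number of experiments


def mixed_info(row, lam):
    """I(theta, phi, lambda) = sum over e of lambda{e} * I(theta, phi, e)."""
    return sum(w * x for w, x in zip(lam, row))


def worst_case(lam):
    """Infimum over the (finite, illustrative) alternative set."""
    return min(mixed_info(row, lam) for row in INFO)


def shrink_toward_uniform(lam, gamma):
    """lambda*{e} = (1 - M*gamma) * lambda{e} + gamma, an element of Lambda_gamma."""
    return [(1 - M * gamma) * w + gamma for w in lam]


def check_bound(lam, gamma):
    """Return lambda*, inf I(., lambda*) and (1 - M*gamma) * inf I(., lambda)."""
    lam_star = shrink_toward_uniform(lam, gamma)
    return lam_star, worst_case(lam_star), (1 - M * gamma) * worst_case(lam)


if __name__ == "__main__":
    for gamma in (0.2, 0.1, 0.01):
        lam_star, lhs, rhs = check_bound([0.45, 0.55], gamma)
        print(gamma, lam_star, lhs, rhs, lhs >= rhs)
```

The inequality holds for any starting design, not only the maximizing one, because the discarded term $\gamma \sum_e I(\theta, \varphi, e)$ is non-negative; taking the starting design to be the maximizer gives the bound used in the proof.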

THEOREM 4. Let $K$ be a ($\mathcal{T}$) compact subset of $\Theta_1 \cup \Theta_2$, over which the regret
function is bounded. If $\epsilon$ $(\epsilon > 0)$ is given, then for sufficiently small $\gamma_1$ $(\gamma_1 > 0)$
and $\gamma_2$ $(\gamma_2 > 0)$, the risk under $A(\gamma_1, \gamma_2)$ is


$$R(\theta) \le -(1 + \epsilon + \xi^*(c))\, c \log c / I(\theta)$$


for all $\theta \in K$. (Here, $\xi^*(c)$ depends upon $K$, $\epsilon$ and $c$, but tends to zero as $c$ approaches
zero.)
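PROOF. Let $\theta \in K$ be given and let $\delta = \epsilon/(\epsilon + 2)$. Choose $\gamma_2$ so that $0 < \gamma_2 < (\delta/2)(1 + \delta/2)^{-1}$.
By Theorems 2 and 3 there is a ($\mathcal{T}$) neighborhood $Q(\theta)$ of $\theta$
such that

$$E_{\theta'}(N) \le -(1 + \gamma_2)(1 + \delta/2 + \xi(c; \theta, \gamma_1)) \log c / I(\theta', \gamma_1),$$

and $\alpha(\theta') \le Wc$ for all $\theta' \in Q(\theta)$. The risk is, by definition,

$$R(\theta') \le c E_{\theta'}(N) + r_K \alpha(\theta')$$

for all $\theta' \in K$ (where $r_K$ is the upper bound for the regret over $K$). Combining
Theorems 2 and 3,

$$R(\theta') \le -(1 + \delta + \xi'(c; \theta, \gamma_1, \epsilon))\, c \log c / I(\theta', \gamma_1)$$


for $\theta' \in Q(\theta)$ (where $\xi'(c; \theta, \gamma_1, \epsilon)$ approaches zero as $c \to 0$).
Since $K$ is compact, a finite number of neighborhoods $Q(\theta)$ cover $K$:
$K \subset \bigcup_{j=1}^{s} Q(\theta_j)$. Let


$$\xi(c; K, \epsilon, \gamma_1) = \max_{j = 1, \dots, s} \xi'(c; \theta_j, \gamma_1, \epsilon).$$


For all $\theta \in K$,

$$R(\theta) \le -(1 + \delta + \xi(c; K, \epsilon, \gamma_1))\, c \log c / I(\theta, \gamma_1).$$


By Lemma 12, we can pick $\gamma_1$ so small that $I(\theta, \gamma_1) > (1 - \delta) I(\theta)$ for all
$\theta \in K$. Let


$$\xi^*(c) = \xi(c)/(1 - \delta)$$

and the conclusion follows.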
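
The arithmetic behind the final step is short but easy to lose among the constants. The following worked equations are a sketch of how the pieces fit, assuming $\delta = \epsilon/(\epsilon + 2)$, $0 < \gamma_2 < (\delta/2)(1 + \delta/2)^{-1}$, and $\gamma_1$ chosen so that $I(\theta, \gamma_1) > (1 - \delta) I(\theta)$ on $K$; they restate the proof rather than add to it.

```latex
% Step 1: the choice of gamma_2 absorbs the factor (1 + gamma_2) into delta.
(1 + \gamma_2)\Bigl(1 + \tfrac{\delta}{2}\Bigr)
  < \Bigl(1 + \tfrac{\delta}{2}\Bigr) + \tfrac{\delta}{2} = 1 + \delta .

% Step 2: the choice of delta turns (1 + delta)/(1 - delta) into 1 + epsilon.
\frac{1 + \delta}{1 - \delta}
  = \frac{(2\epsilon + 2)/(\epsilon + 2)}{2/(\epsilon + 2)}
  = 1 + \epsilon .

% Step 3: replace I(theta, gamma_1) by the smaller quantity (1 - delta) I(theta);
% note that -c log c > 0 for 0 < c < 1.
R(\theta)
  \le \frac{-(1 + \delta + \xi(c))\, c \log c}{I(\theta, \gamma_1)}
  \le \frac{-(1 + \delta + \xi(c))\, c \log c}{(1 - \delta)\, I(\theta)}
  = \frac{-\bigl(1 + \epsilon + \xi(c)/(1 - \delta)\bigr)\, c \log c}{I(\theta)} .
```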


11. Comparison with other procedures. This section will serve to establish
the optimality of the class of procedures $\{A(\gamma_1, \gamma_2)\}$ in the following sense: If
a procedure $B$ has risk $R(\theta') < -(1 + o(1))\, c \log c / I(\theta')$ for some $\theta' \in \Theta_1 \cup \Theta_2$,
then for some other $\theta'' \in \Theta_1 \cup \Theta_2$, $R(\theta'')$ is of a greater order of magnitude than
$-c \log c$:


(i.e., $\limsup_{c \to 0} R(\theta'')/(-c \log c) = \infty$).


Thus, the risk under procedure $B$ is greater (by an order of magnitude) for
some $\theta'' \in \Theta_1 \cup \Theta_2$ if it is significantly smaller than that of $A(\gamma_1, \gamma_2)$ for any
other $\theta'$.

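Three preliminary lemmas are required. The first is a theorem about convex
sets:

LEMMA 13. Given $\theta \in \Theta_1 \cup \Theta_2$ and $\delta > 0$, there is a finite set $\Phi_\delta = \Phi_\delta(\theta) \subset a(\theta)$
having the property that


$$\max_{e \in \mathcal{E}} I(\theta, \varphi, e) < \infty \quad \text{for all } \varphi \in \Phi_\delta$$

and

$$\max_{\lambda \in \Lambda} \min_{\varphi \in \Phi_\delta} I(\theta, \varphi, \lambda) \le I(\theta) + \delta/2.$$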


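PROOF. Let $S_0 = \{s : s = (I(\theta, \varphi, e_1), \dots, I(\theta, \varphi, e_M)),\ \varphi \in a(\theta)\}$, let $S_c$ be
the convex hull of $S_0$ and let $\bar S$ be the closure (in $M$ dimensional Euclidean
space) of $S_c$. Define a function $m$ on $\Lambda \times \bar S$ by: $m(\lambda, s) = \sum_{i} s_i \lambda\{e_i\}$ (where,
of course, $s = (s_1, \dots, s_M)$). By Theorem 2.2.7 of [1], $\max_{\lambda \in \Lambda} m(\lambda, s) = \max_{1 \le i \le M} s_i$
is continuous and convex on $\bar S$. Since $I(\theta, \varphi, e) \ge 0$ for all $\varphi \in a(\theta)$,
$\bar S$ is a closed subset of the positive orthant and $\max_{\lambda \in \Lambda} m(\lambda, s) \ge 0$ for all
$s \in \bar S$. It follows that $\max_{\lambda \in \Lambda} m(\lambda, s)$ achieves its minimum on $\bar S$ (say at
$\bar s = (\bar s_1, \dots, \bar s_M)$). Let


$$\bar I(\theta) = \min_{s \in \bar S} \max_{\lambda \in \Lambda} m(\lambda, s) = \max_{\lambda \in \Lambda} m(\lambda, \bar s).$$


In particular, $\bar I(\theta) = \max_{1 \le i \le M} \bar s_i$. Since $\bar s$ is a point of closure of $S_c$, there
is a point $s^* = (s_1^*, \dots, s_M^*)$ in $S_c$, such that $\max_{1 \le k \le M} s_k^* < \max_{1 \le i \le M} \bar s_i + \delta/2 = \bar I(\theta) + \delta/2$.
By Theorem 2.2.2 of [1], $s^*$ is a convex combination of a
set of $M' \le M + 1$ points $s^{(i)}$ of $S_0$:


$$s^* = \sum_{i=1}^{M'} a_i s^{(i)} \quad \text{where } a_i > 0,\ \sum_{i=1}^{M'} a_i = 1.$$


Let $\varphi^{(i)}$ be such that


$$s^{(i)} = (I(\theta, \varphi^{(i)}, e_1), \dots, I(\theta, \varphi^{(i)}, e_M)), \quad i = 1, 2, \dots, M'.$$


Let $\Phi_\delta = \{\varphi^{(1)}, \varphi^{(2)}, \dots, \varphi^{(M')}\}$. Since $s_k^* < \bar I(\theta) + \delta/2$ for $k = 1, 2, \dots, M$,
it follows that

$$\min_{\varphi \in \Phi_\delta} I(\theta, \varphi, e) < \bar I(\theta) + \delta/2 \quad \text{for each } e,$$

so that

$$\max_{\lambda \in \Lambda} \min_{\varphi \in \Phi_\delta} I(\theta, \varphi, \lambda) < \bar I(\theta) + \delta/2.$$


But,


$$\max_{\lambda \in \Lambda} \Big[\inf_{s \in \bar S} m(\lambda, s)\Big] \le \max_{\lambda \in \Lambda} \Big[\inf_{s \in S_0} m(\lambda, s)\Big] = \max_{\lambda \in \Lambda} \Big[\inf_{\varphi \in a(\theta)} I(\theta, \varphi, \lambda)\Big] = I(\theta).$$


By Theorem 2.4.2 of [1],


$$\max_{\lambda \in \Lambda} \Big[\inf_{s \in \bar S} m(\lambda, s)\Big] = \min_{s \in \bar S} \Big[\max_{\lambda \in \Lambda} m(\lambda, s)\Big] = \bar I(\theta),$$


so that $\bar I(\theta) \le I(\theta)$. By B1, $I(\theta) < \infty$ for $\theta \in \Theta_1 \cup \Theta_2$, so that $\bar I(\theta) < \infty$.
Hence,


$$\max_{e \in \mathcal{E}} I(\theta, \varphi, e) < \infty \quad \text{for all } \varphi \in \Phi_\delta;$$


for if $I(\theta, \varphi^{(i)}, e_k) = \infty$, then $s_k^{(i)} = \infty$, contradicting the fact that


$$\max_{1 \le k \le M} s_k^* = \max_{1 \le k \le M} \sum_{i=1}^{M'} a_i I(\theta, \varphi^{(i)}, e_k) < \bar I(\theta) + \delta/2.$$


This establishes the lemma, since


$$\max_{\lambda \in \Lambda} \Big[\min_{\varphi \in \Phi_\delta} I(\theta, \varphi, \lambda)\Big] \le \bar I(\theta) + \delta/2 \le I(\theta) + \delta/2.$$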
DEFINITION.

$$S_n(\theta, \varphi) = \sum_{j=1}^{n} \log \frac{f(X^{(j)}, \theta, \lambda^{(j)})}{f(X^{(j)}, \varphi, \lambda^{(j)})}.$$

The next two lemmas establish the sought-after optimality property. The
first says that $S_N(\theta, \varphi)$ must be large with high probability if the probability of
error is to be small at $\theta$ and $\varphi$. The second shows that the rate of growth of
$S_n(\theta, \varphi)$ is such that $n$ must be very large in order to make $S_n(\theta, \varphi)$ large.
Together, these lemmas show that the expected sample size must be large if
the probability of error is to be kept small. (Here, $N$ is the sample size required
to reach a terminal decision.)

LEMMA 14. Suppose $\theta \in \Theta_1 \cup \Theta_2$, $\varphi \in a(\theta)$ and procedure $B$ has the property
that $\alpha(\theta) = O(-c \log c)$ and $\alpha(\varphi) = O(-c \log c)$. Then for any $\delta$ $(0 < \delta < 1)$,


$$P_\theta[S_N(\theta, \varphi) < -(1 - \delta) \log c] = O(-c^{\delta} \log c).$$


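PROOF. Assume (without loss of generality) that $\theta \in \Theta_1$. Let
$B_n = [H_1 \text{ is accepted on the } n\text{th trial}] \cap [S_N(\theta, \varphi) < -(1 - \delta) \log c]$.
Then,

$$P_\theta[S_N(\theta, \varphi) < -(1 - \delta) \log c] \le \sum_{n} P_\theta[B_n] + P_\theta[H_1 \text{ is rejected}] \le \sum_{n} P_\theta[B_n] + O(-c \log c).$$


Since


$$P_\varphi[\text{accept } H_1] = O(-c \log c) \ge \sum_{n} P_\varphi[B_n] = \sum_{n} \int_{B_n} \prod_{j=1}^{n} f(x^{(j)}, \varphi, \lambda^{(j)})\, d\mu(x^{(j)})$$


$$\ge \sum_{n} \int_{B_n} e^{(1 - \delta) \log c} \prod_{j=1}^{n} f(x^{(j)}, \theta, \lambda^{(j)})\, d\mu(x^{(j)}) = \exp\{(1 - \delta) \log c\} \sum_{n} P_\theta[B_n],$$

it follows that

$$P_\theta[S_N(\theta, \varphi) < -(1 - \delta) \log c] \le c^{\delta - 1}\, O(-c \log c) = O(-c^{\delta} \log c).$$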
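LEMMA 15. Given $\theta \in \Theta_1 \cup \Theta_2$ and $\delta > 0$,

$$P_\theta\Big[\max_{1 \le m \le n} \min_{\varphi \in \Phi_\delta} S_m(\theta, \varphi) \ge n[I(\theta) + \delta]\Big] = O(1/n).$$

PROOF. By Lemma 13, $\min_{\varphi \in \Phi_\delta} I(\theta, \varphi, \lambda) \le I(\theta) + \delta/2$ for all $\lambda \in \Lambda$. Hence,


$$\min_{\varphi \in \Phi_\delta} \sum_{j=1}^{m} I(\theta, \varphi, \lambda^{(j)}) = m \Big(\min_{\varphi \in \Phi_\delta} I(\theta, \varphi, \lambda^*)\Big) \le m(I(\theta) + \delta/2)$$


for any set of (randomized) experiments $\{\lambda^{(1)}, \dots, \lambda^{(m)}\}$ where


$$\lambda^*\{e\} = (1/m) \sum_{j=1}^{m} \lambda^{(j)}\{e\} \quad (\text{so that } \lambda^* \in \Lambda).$$

Let

$$Z_m^{(1)}(\varphi) = \sum_{j=1}^{m} \Big[\log \frac{f(X^{(j)}, \theta, \lambda^{(j)})}{f(X^{(j)}, \varphi, \lambda^{(j)})} - I(\theta, \varphi, \lambda^{(j)})\Big],$$


$$Z_m^{(2)}(\varphi) = \sum_{j=1}^{m} I(\theta, \varphi, \lambda^{(j)}).$$

If $n \ge m$, $\min_{\varphi \in \Phi_\delta} Z_m^{(2)}(\varphi) \le n(I(\theta) + \delta/2)$, so that if

$$\min_{\varphi \in \Phi_\delta} \big(Z_m^{(1)}(\varphi) + Z_m^{(2)}(\varphi)\big) \ge n(I(\theta) + \delta),$$

then


$$\max_{\varphi \in \Phi_\delta} Z_m^{(1)}(\varphi) \ge n\delta/2 \quad \text{for } n \ge m.$$

Since

$$S_m(\theta, \varphi) = Z_m^{(1)}(\varphi) + Z_m^{(2)}(\varphi),$$

$$P_\theta\Big[\max_{1 \le m \le n} \min_{\varphi \in \Phi_\delta} S_m(\theta, \varphi) \ge n(I(\theta) + \delta)\Big] \le \sum_{\varphi \in \Phi_\delta} P_\theta\Big[\max_{1 \le m \le n} Z_m^{(1)}(\varphi) \ge n\delta/2\Big].$$


For each $\varphi$, $\{Z_m^{(1)}(\varphi)\}$ is a martingale sequence, so that $\{|Z_m^{(1)}(\varphi)|\}$ is a semi-martingale.
By applying Theorem 3.2 of [4] we obtain

$$P_\theta\Big[\max_{1 \le m \le n} |Z_m^{(1)}(\varphi)| \ge n\delta/2\Big] \le 4 E_\theta |Z_n^{(1)}(\varphi)|^2 / n^2 \delta^2.$$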

Let 


fa oo) .. 5 al, 


2 er an, 3 
si ashi be! PE 


Since $\varphi \in \Phi_\delta \subset a(\theta)$, we have, by Lemma 13, that $I(\theta, \varphi, e) < \infty$ for all $e \in \mathcal{E}$.
By (i, o(0,¢) < ~ forallge;. 
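Let $\sigma_\delta^2(\theta) = \sum_{\varphi \in \Phi_\delta} \sigma^2(\theta, \varphi)$.
Since

$$E_\theta |Z_n^{(1)}(\varphi)|^2 \le n \sigma^2(\theta, \varphi),$$


we conclude that


$$P_\theta\Big[\max_{1 \le m \le n} \min_{\varphi \in \Phi_\delta} S_m(\theta, \varphi) \ge n(I(\theta) + \delta)\Big] \le 4 \sigma_\delta^2 / n \delta^2 = O(1/n).$$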


The main theorem of this section now follows readily. 


THEOREM 5. If a procedure $B$ has risk $R(\theta) = O(-c \log c)$ for each $\theta \in \Theta_1 \cup \Theta_2$,
then


$$R(\theta) \ge -(1 + o(1))\, c \log c / I(\theta)$$

for all $\theta \in \Theta_1 \cup \Theta_2$.
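PROOF. Let $n^* = n^*(c, \delta) = -[(1 - \delta) \log c]/[I(\theta) + \delta]$, $(0 < \delta < 1)$.

$$P_\theta[N \le n^*] \le P_\theta\Big[\max_{1 \le m \le n^*} \min_{\varphi \in \Phi_\delta} S_m(\theta, \varphi) \ge n^*(I(\theta) + \delta)\Big] + P_\theta\Big[\min_{\varphi \in \Phi_\delta} S_N(\theta, \varphi) \le -(1 - \delta) \log c\Big].$$

(This is so because


$$\max_{1 \le m \le n^*} \min_{\varphi \in \Phi_\delta} S_m(\theta, \varphi) \ge n^*(I(\theta) + \delta)$$


whenever $N \le n^*$ and $\min_{\varphi \in \Phi_\delta} S_N(\theta, \varphi) \ge -(1 - \delta) \log c$.)

Since $R(\theta) = O(-c \log c)$ for each $\theta \in \Theta_1 \cup \Theta_2$, Lemma 14 applies to each $\theta$
and each $\varphi \in a(\theta)$. (Since $R(\theta) = c E_\theta(N) + r(\theta)\alpha(\theta) = O(-c \log c)$ and
since $r(\theta) > 0$ on $\Theta_1 \cup \Theta_2$, $\alpha(\theta) = O(-c \log c)$ on $\Theta_1 \cup \Theta_2$.) In particular,


$$P_\theta\Big[\min_{\varphi \in \Phi_\delta} S_N(\theta, \varphi) \le -(1 - \delta) \log c\Big] \le \sum_{\varphi \in \Phi_\delta} P_\theta[S_N(\theta, \varphi) \le -(1 - \delta) \log c] = O(-c^{\delta} \log c).$$

By Lemma 15


$$P_\theta\Big[\max_{1 \le m \le n^*} \min_{\varphi \in \Phi_\delta} S_m(\theta, \varphi) \ge n^*(I(\theta) + \delta)\Big] = O((-\log c)^{-1}).$$

Hence,


$$E_\theta(N) \ge n^* P_\theta[N > n^*] \ge \frac{-(1 - \delta)(1 + o(1)) \log c}{I(\theta) + \delta}.$$


Consequently,


$$R(\theta) \ge c E_\theta(N) \ge \frac{-(1 + o(1))(1 - \delta)\, c \log c}{I(\theta) + \delta}$$


for all $\delta$ $(0 < \delta < 1)$. Hence,

$$R(\theta) \ge -(1 + o(1))\, c \log c / I(\theta)$$


which was to be proved.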


12. Concluding remarks. (a). The optimal properties of the class of procedures
$\{A(\gamma_1, \gamma_2)\}$ have been established only for those points $\theta$ where the regret is
positive. It is quite likely that these procedures are not optimal when the true
state of nature lies in a region where the regret is zero. The core of the difficulty
lies in the fact that for most meaningful statistical problems, $I(\theta)$ is zero on
the boundary between the two hypotheses, rendering Theorem 2 virtually useless.
To put it another way, when $\theta$ is a boundary point, the likelihood ratio tends to
be small in magnitude, causing the expected sample size to be large.

(b). In any particular case, the choice of $(\Theta^*, \mathcal{T}^*)$ to compactify $\Theta$ need not
be unique. However, there seems to be a natural method of determining a suitable
compactification of $\Theta$ (if one exists at all): With each point $\theta \in \Theta$, we can
associate a point in a function space, $\mathcal{F}(\theta) = (f(\cdot, \theta, e_1), \dots, f(\cdot, \theta, e_M))$. If
we denote this function space by $\mathcal{L}$ and let $\mathcal{L}^*$ be the set of limit points of $\mathcal{L}$
(in the sense of almost sure convergence), it seems natural to define $\Theta^*$ so that
the domain of $\mathcal{F}(\cdot)$ can be extended in such a way that $\mathcal{F}(\cdot)$ now takes values
in $\mathcal{L}^*$. The topology on $\Theta^*$ will most naturally be one for which component-wise
continuity for $\mathcal{F}$ can be established.

For the prototype example, $\Theta$ is Euclidean two space and the function space
$\mathcal{L}$ consists of all functions $\mathcal{F}(m_1, m_2) = (2\pi)^{-1/2}\big(e^{-(y - m_1)^2/2},\ e^{-(y - m_2)^2/2}\big)$ as $m_1$ and
$m_2$ range over the real line. $\mathcal{L}^*$ consists of all functions $\mathcal{F}(m_1, m_2)$ as $m_1$ and $m_2$
range over the extended real line. It seems quite natural to take $\Theta^*$ to be all
points of the form


$$\varphi = (m_1, m_2), \quad -\infty \le m_1, m_2 \le \infty$$


and the obvious topology on the enlarged set $\Theta^*$ will satisfy the conditions set
forth in our assumptions. The relativization of $\mathcal{T}^*$ to $\Theta$ will be the usual topology
on $R^2$. Alternatively, we could take $\Theta_0^*$ to be all points of the form


$$\varphi = (m_1, m_2), \quad -\infty < m_1, m_2 \le \infty$$


and let $\mathcal{T}_0^*$ be the relativization of $\mathcal{T}^*$ to $\Theta_0^*$. Again, $\mathcal{T}_0^*$ will satisfy the required
conditions and the relativization of $\mathcal{T}_0^*$ to $\Theta$ is also the usual Euclidean topology.

(c). Finally, a word should be said concerning the apparent complexity of the
class of procedures $\{A(\gamma_1, \gamma_2)\}$. The use of the $\rho$-p.m.l.e. $\hat\theta_n$ (with $0 < \rho < 1$),
instead of the seemingly more tractable maximum likelihood estimate, is
necessitated by the fact that the maximum likelihood estimate may not exist in $\Theta$ whereas $\hat\theta_n$ always will. (When
$\Theta$ is a finite set, as was the case in [2], the maximum likelihood estimate will always exist in $\Theta$ and hence, it is
permissible to take $\rho = 1$.)

In order to guarantee the consistency of $\hat\theta_n$, it is, in general, necessary that
the randomized experiments $\lambda^{(j)}$ put positive weight on each $e$ in $\mathcal{E}$. In our prototype
problem for instance, suppose that the experimental rule dictates that $e_1$ be
performed on the first trial and $e_2$ be performed thereafter. Then $\hat\theta_n = (\hat m_{1n}, \hat m_{2n})$
will not converge to the true parameter. To circumvent this difficulty, we require
that the $\lambda^{(j)}$'s be chosen from $\Lambda_\gamma$.

Chernoff recognized this difficulty in [2] and he proposed a different modification
of the experimental rule which allowed him to choose his experiments from
the larger class $\Lambda$. However, it appears that this technique is not readily analyzable
in the case where $\Theta$ is infinite.

Under the stopping rule given in [2], the probability of error is $O(c)$. When
the parameter space is infinite however, this author can prove only that the
probability of error is $O(c^{1/(1 + \gamma_2)})$ for any $\gamma_2 > 0$. By modifying the stopping rule
so that sampling ceases when the likelihood ratio is greater than $c^{-(1 + \gamma_2)}$ (for $c$
between zero and one) instead of $c^{-1}$ (as in [2]) we too can attain a probability
of error which is $O(c)$.

(d). It seems natural and desirable to extend the results contained here to
the case where $\mathcal{E}$ is an infinite set. Such a result would have many applications
to questions arising in connection with statistical inference on time series. It
appears as though suitable continuity restrictions on $f(y, \theta, \cdot)$ would permit the
application of the techniques employed here to establish the necessary results.
ANALYSIS OF A CLASS OF PBIB DESIGNS WITH MORE THAN
TWO ASSOCIATE CLASSES


By P. V. Rao
College of Science, Nagpur, India


1. Introduction and Summary. The use of 2-associate PBIB designs is fairly
common in experimental work. However, PBIB designs with more than two associate
classes are not widely used because of the complicated nature of the analysis
and construction involved. Recently, in an interesting paper [2], Shah constructed
a number of 3-associate PBIB designs by what may be called the matrix substitution
method. In this method, the incidence matrix of the 3-associate PBIB
design is constructed by replacing the integers of a balanced matrix in S integers
(for example, the matrix might be the incidence matrix of a BIB design, that is,
a balanced matrix in two integers) by the incidence matrices of S associable
BIB designs. The present author [1] and Shah [3] have shown that the above
method may be used to construct a PBIB design with 2m + 1 associate classes
by replacing the integers of a PBIB design with m associate classes by the incidence
matrices of two associable BIB designs. Shah [3] has also given a simple
method of analysis for PBIB designs with $2^n - 1$ associate classes constructed
by the matrix substitution method. Thus, Shah's method of analysis may be used
to analyze PBIB designs with 3 and 7 associate classes (corresponding to n = 2
and 3) which are of practical interest. In the present paper, simple methods of
analysis for a class of PBIB designs with 3 and 5 associate classes are given. The
method given here for PBIB designs with three associate classes provides an
alternative method, and it is hoped that this method is simpler and more direct
than that given by Shah.


2. Notation and some known results. The symbol '$\otimes$' will be used to denote
the Kronecker product of two matrices. Thus, if $A = (a_{ij})$ and $B = (b_{ij})$ are
two $m \times n$ and $p \times q$ matrices we have,


$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{pmatrix}.$$

The following square matrix:


$$\begin{pmatrix} A_1 & A_2 & \cdots & A_2 \\ A_2 & A_1 & \cdots & A_2 \\ \vdots & \vdots & & \vdots \\ A_2 & A_2 & \cdots & A_1 \end{pmatrix},$$

where $A_1$ and $A_2$ are scalars or square matrices of the same order, will be denoted
by $(A_1 | A_2)_v$, where the subscript $v$ stands for the order of the matrix when $A_1$ and $A_2$ are
considered as its elements. It can be easily verified that


$$(2.1) \qquad (c_1 A_1 | c_2 A_2)_v = I(v) \otimes (c_1 A_1 - c_2 A_2) + E(v) \otimes c_2 A_2,$$


where $c_1$ and $c_2$ are scalars, $I(v)$ is the $v \times v$ identity matrix, and $E(v)$ is the
$v \times v$ matrix with all elements equal to unity.
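
Identity (2.1) is easy to confirm on a small case. The following is a minimal sketch in plain Python, not part of the paper: the helper names (`kron`, `pattern`, `pattern_via_kron`) and the example blocks are illustrative assumptions, and the scalars $c_1$, $c_2$ are taken as already absorbed into the blocks.

```python
def kron(X, Y):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in xrow for y in yrow] for xrow in X for yrow in Y]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def ones(n):
    return [[1] * n for _ in range(n)]


def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]


def sub(X, Y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]


def pattern(A1, A2, v):
    """Build (A1|A2)_v directly: A1 in diagonal blocks, A2 in all other blocks."""
    p = len(A1)
    rows = []
    for bi in range(v):
        for i in range(p):
            row = []
            for bj in range(v):
                block = A1 if bi == bj else A2
                row.extend(block[i])
            rows.append(row)
    return rows


def pattern_via_kron(A1, A2, v):
    """Right-hand side of (2.1): I(v) (x) (A1 - A2) + E(v) (x) A2."""
    return add(kron(identity(v), sub(A1, A2)), kron(ones(v), A2))


if __name__ == "__main__":
    A1 = [[2, 1], [1, 2]]  # illustrative blocks only
    A2 = [[0, 3], [3, 0]]
    print(pattern(A1, A2, 3) == pattern_via_kron(A1, A2, 3))
```

Scalars are covered by passing 1 x 1 blocks such as `[[5]]` and `[[2]]`.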

Every design will be denoted by its incidence matrix. Two designs will be called
associable if they satisfy the definition given by Shah [2]. It is known that, if $N_0$
and $N_1$ are associable BIB designs with parameters $v_1, b_1, r_0, k_0, \lambda_0$; $v_1, b_1, r_1, k_1, \lambda_1$
and parameters of association $\mu$ and $\eta$, and if $A$ is a PBIB design with parameters $v_2, b_2,
r_2, k_2$; $\lambda_{12}, \lambda_{22}, \dots, \lambda_{m2}$; $p^q_{uu'}$ $(u, u', q = 1, 2, \dots, m)$, then the design $N$, obtained
from $A$ by replacing the integer $i$ by $N_i$ $(i = 0, 1)$, is a PBIB design with
$2m + 1$ associate classes and parameters $v = v_1 v_2$, $b = b_1 b_2$, $r = r_1 r_2 + (b_2 - r_2) r_0$,
$k = k_1 k_2 + (v_2 - k_2) k_0$.

Let $y_{ij}$ denote the yield of the plot in the $j$th block of $N$ to which the $i$th treatment
is applied. For the purpose of the analysis we assume the model


$$(2.2) \qquad y_{ij} = a + t_i + b_j + \epsilon_{ij},$$


where $a$, $t_i$, $b_j$ are respectively the general effect, the effect of the $i$th treatment
and the effect of the $j$th block. The $\epsilon_{ij}$ are independent normal variates with
mean 0 and variance $\sigma^2$. Let $T_i$ and $B_j$ denote respectively the total yield due to
the $i$th treatment and the $j$th block of $N$. If the column vectors
$(T_1, T_2, \dots, T_v)$, $(B_1, B_2, \dots, B_b)$, $(t_1, t_2, \dots, t_v)$ and $(\hat t_1, \hat t_2, \dots, \hat t_v)$ are
denoted by $T$, $B$, $t$, and $\hat t$ respectively, then it is well known that the reduced
normal equations for the intra-block estimation of treatment contrasts are


$$(2.3) \qquad Q = C \hat t,$$


where


$$(2.4) \qquad Q = T - (1/k) N B,$$

and

$$(2.5) \qquad C = r I(v) - (1/k) N N'.$$

The adjusted treatment sum of squares is equal to

$$(2.6) \qquad \hat t' Q = \sum_{i} \hat t_i Q_i.$$


Further, if


$$(2.7) \qquad \hat t_i = d_{i1} Q_1 + d_{i2} Q_2 + \cdots + d_{iv} Q_v$$


is a solution of the normal equations, then the variance of the best estimate of
any estimable parametric function $\sum_i c_i t_i$ is given by


$$(2.8) \qquad V\Big(\sum_{i} c_i \hat t_i\Big) = \Big(\sum_{i} \sum_{j} c_i c_j d_{ij}\Big) \sigma^2.$$
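
To make (2.4) and (2.5) concrete, the following is a minimal sketch that builds $Q$ and $C$ for a small BIB design with $v = b = 3$, $r = k = 2$, $\lambda = 1$. The yields are made-up numbers and the function names are hypothetical. It uses exact fractions so that the two structural facts, the rows of $C$ summing to zero and the adjusted totals $Q_i$ summing to zero, can be checked without rounding.

```python
from fractions import Fraction

# Incidence matrix of a small BIB design: rows are treatments, columns are blocks.
N = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]
r, k = 2, 2

# Hypothetical yields y[i][j] of treatment i in block j (0 where not applied).
Y = [
    [10, 12, 0],
    [9, 0, 11],
    [0, 14, 13],
]


def c_matrix(N, r, k):
    """C = r I(v) - (1/k) N N', as in (2.5)."""
    v = len(N)
    nnt = [[sum(a * b for a, b in zip(N[i], N[j])) for j in range(v)]
           for i in range(v)]
    return [[(r if i == j else 0) - Fraction(nnt[i][j], k) for j in range(v)]
            for i in range(v)]


def adjusted_totals(N, y, k):
    """Q = T - (1/k) N B, as in (2.4)."""
    v, b = len(N), len(N[0])
    T = [sum(y[i][j] for j in range(b) if N[i][j]) for i in range(v)]
    B = [sum(y[i][j] for i in range(v) if N[i][j]) for j in range(b)]
    return [T[i] - Fraction(sum(N[i][j] * B[j] for j in range(b)), k)
            for i in range(v)]


if __name__ == "__main__":
    C = c_matrix(N, r, k)
    Q = adjusted_totals(N, Y, k)
    print(C)
    print(Q, sum(Q))
```

Because $C$ has zero row sums it is singular, which is why a side condition on the treatment effects is needed before the normal equations can be solved uniquely.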
Let $N$ be a PBIB design with parameters $v, b, r, k$; $\lambda_1, \lambda_2, \dots, \lambda_m$;
$p^q_{uu'}$ $(u, u', q = 1, 2, \dots, m)$. Define $m + 1$ $v \times v$ matrices $B_q$ $(q = 0, 1, \dots, m)$
as

$$(2.9) \qquad B_q = (b^q_{ij}), \quad i, j = 1, 2, \dots, v,$$

where

$b^q_{ij} = 1$ if the $i$th and $j$th treatments are $q$th associates in $N$

$\phantom{b^q_{ij}} = 0$ otherwise,


with a convention that every treatment is its own 0th associate. Then it is easy
to show that


$$(2.10) \qquad B_0 + B_1 + \cdots + B_m = E(v),$$

and

$$(2.11) \qquad N N' = r B_0 + \lambda_1 B_1 + \cdots + \lambda_m B_m.$$

3. A Lemma.

LEMMA 3.1: Let $A$ be a 2-associate PBIB design with parameters $v_2, b_2, r_2, k_2$;
$\lambda_{12}, \lambda_{22}$; $p^q_{uu'}$ and let $N_0$ and $N_1$ be two associable BIB designs with parameters $v_1$,
$b_1, r_0, k_0, \lambda_0$; $v_1, b_1, r_1, k_1, \lambda_1$; $\mu$ and $\eta$. If $N$ is a design obtained by replacing
the integer $i$ in $A$ by the matrix $N_i$ $(i = 0, 1)$, then

$$(3.1) \qquad N N' = A A' \otimes [c_1 I(v_1) + c_2 E(v_1)] + E(v_2) \otimes [c_3 I(v_1) + c_4 E(v_1)],$$

where

$$c_1 = (r_1 + r_0 - 2\mu) - (\lambda_1 + \lambda_0 - 2\eta),$$

$$c_2 = \lambda_1 + \lambda_0 - 2\eta,$$

$$c_3 = [b_2 r_0 - 2 r_2 (r_0 - \mu)] - [b_2 \lambda_0 - 2 r_2 (\lambda_0 - \eta)],$$

$$c_4 = b_2 \lambda_0 - 2 r_2 (\lambda_0 - \eta).$$

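PROOF: If we consider $N$ as a partitioned matrix with elements $N_0$ and $N_1$, it
is easy to verify that the $(ij)$th element in the product $N N'$ is equal to


$$r_2 N_1 N_1' + (b_2 - r_2) N_0 N_0' \quad \text{if } i = j,$$

$$\lambda_{12} N_1 N_1' + 2(r_2 - \lambda_{12}) N_0 N_1' + (b_2 - 2 r_2 + \lambda_{12}) N_0 N_0'$$


if the treatments $i$ and $j$ are first associates in $A$,

$$\lambda_{22} N_1 N_1' + 2(r_2 - \lambda_{22}) N_0 N_1' + (b_2 - 2 r_2 + \lambda_{22}) N_0 N_0'$$


if the treatments $i$ and $j$ are second associates in $A$.
Defining $B_q$ $(q = 0, 1, 2)$ as in (2.9), it follows that


$$(3.2) \qquad \begin{aligned} N N' &= B_0 \otimes [r_2 N_1 N_1' + (b_2 - r_2) N_0 N_0'] + B_1 \otimes [\lambda_{12} N_1 N_1' + 2(r_2 - \lambda_{12}) N_0 N_1' + (b_2 - 2 r_2 + \lambda_{12}) N_0 N_0'] \\ &\quad + B_2 \otimes [\lambda_{22} N_1 N_1' + 2(r_2 - \lambda_{22}) N_0 N_1' + (b_2 - 2 r_2 + \lambda_{22}) N_0 N_0'] \\ &= [r_2 B_0 + \lambda_{12} B_1 + \lambda_{22} B_2] \otimes N_1 N_1' + [(b_2 - 2 r_2)(B_0 + B_1 + B_2) + r_2 B_0 + \lambda_{12} B_1 + \lambda_{22} B_2] \otimes N_0 N_0' \\ &\quad + 2[r_2 (B_0 + B_1 + B_2) - (r_2 B_0 + \lambda_{12} B_1 + \lambda_{22} B_2)] \otimes N_0 N_1'. \end{aligned}$$

From (2.10) and (2.11) we have

$$B_0 + B_1 + B_2 = E(v_2), \qquad r_2 B_0 + \lambda_{12} B_1 + \lambda_{22} B_2 = A A'.$$

Hence from (3.2) it follows that

$$(3.3) \qquad N N' = A A' \otimes [N_1 N_1' + N_0 N_0' - 2 N_0 N_1'] + E(v_2) \otimes [(b_2 - 2 r_2) N_0 N_0' + 2 r_2 N_0 N_1'].$$


Since $N_0$ and $N_1$ are associable BIB designs with parameters of association $\mu$ and
$\eta$, it follows that


$$N_0 N_0' = (r_0 - \lambda_0) I(v_1) + \lambda_0 E(v_1),$$

$$N_1 N_1' = (r_1 - \lambda_1) I(v_1) + \lambda_1 E(v_1),$$

$$N_0 N_1' = (\mu - \eta) I(v_1) + \eta E(v_1).$$

Substituting for $N_0 N_0'$, $N_1 N_1'$ and $N_0 N_1'$ in (3.3) it is easy to verify that


$$N N' = A A' \otimes [c_1 I(v_1) + c_2 E(v_1)] + E(v_2) \otimes [c_3 I(v_1) + c_4 E(v_1)],$$


which completes the proof.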
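
Formula (3.1) can be spot-checked by building $N$ explicitly for a tiny example. The sketch below is an illustration under stated assumptions, not the author's computation: $A$ and $N_1$ are both taken to be the BIB design with $v = b = 3$, $r = k = 2$, $\lambda = 1$ (so $A$ is the degenerate case $\lambda_{12} = \lambda_{22}$), $N_0$ is the all-ones matrix (so $r_0 = \lambda_0 = b_1$ and $\mu = \eta = r_1$), and all helper names are hypothetical.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def kron(X, Y):
    return [[x * y for x in xrow for y in yrow] for xrow in X for yrow in Y]


def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]


def scale(c, X):
    return [[c * a for a in row] for row in X]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def ones(n, m):
    return [[1] * m for _ in range(n)]


def substitute(A, N0, N1):
    """Replace each 1 in A by the block N1 and each 0 by the block N0."""
    rows = []
    for arow in A:
        blocks = [N1 if a == 1 else N0 for a in arow]
        for i in range(len(N0)):
            rows.append([x for blk in blocks for x in blk[i]])
    return rows


# Illustrative designs (assumptions, see the lead-in).
A = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]   # v2 = b2 = 3, r2 = 2
N1 = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]  # v1 = b1 = 3, r1 = 2, lambda1 = 1
N0 = ones(3, 3)                         # r0 = lambda0 = 3, mu = eta = r1 = 2

v1, v2, b2, r2 = 3, 3, 3, 2
r1, lam1, r0, lam0, mu, eta = 2, 1, 3, 3, 2, 2

c1 = (r1 + r0 - 2 * mu) - (lam1 + lam0 - 2 * eta)
c2 = lam1 + lam0 - 2 * eta
c3 = (b2 * r0 - 2 * r2 * (r0 - mu)) - (b2 * lam0 - 2 * r2 * (lam0 - eta))
c4 = b2 * lam0 - 2 * r2 * (lam0 - eta)

N = substitute(A, N0, N1)
NNt = matmul(N, transpose(N))
AAt = matmul(A, transpose(A))
lemma_rhs = add(
    kron(AAt, add(scale(c1, identity(v1)), scale(c2, ones(v1, v1)))),
    kron(ones(v2, v2), add(scale(c3, identity(v1)), scale(c4, ones(v1, v1)))),
)

if __name__ == "__main__":
    print((c1, c2, c3, c4), NNt == lemma_rhs)
```

Under these assumptions the constants agree with the all-ones special case of the lemma ($c_1 = r_1 - \lambda_1$, $c_3 = 0$), and the diagonal of $N N'$ equals the replication number $r = r_1 r_2 + (b_2 - r_2) r_0 = 7$.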

Most of the designs of practical interest will occur when $N_0$ is taken as a null
matrix or as the incidence matrix of a randomized block design. The expression
for $N N'$ may be further simplified in these two cases. The values of $c_1, c_2, c_3, c_4$
are given below for the two particular cases:

Case i:

$$N_0 = O(v_1 \times b_1),$$

where $O(v_1 \times b_1)$ is the $v_1 \times b_1$ null matrix. Since $r_0 = k_0 = \lambda_0 = \mu = \eta = 0$,
we have $c_1 = r_1 - \lambda_1$, $c_2 = \lambda_1$, $c_3 = 0$ and $c_4 = 0$.

Case ii:


$$N_0 = E(v_1 \times b_1),$$


where $E(v_1 \times b_1)$ is a $v_1 \times b_1$ matrix with all elements equal to unity. Since
$r_0 = \lambda_0 = b_1$, $k_0 = v_1$, $\mu = \eta = r_1$, we have $c_1 = r_1 - \lambda_1$, $c_2 = \lambda_1 + b_1 - 2 r_1$,
$c_3 = 0$ and $c_4 = b_2 b_1 - 2 r_2 (b_1 - r_1)$.

It is known that $N$ defined in Lemma 3.1 is the incidence matrix of a PBIB
design with five associate classes. The reduced normal equations for the intra-block
estimates of treatment comparisons are $Q = C \hat t$, where

$$(3.4) \qquad C = r I(v_1 v_2) - (1/k) N N', \quad r = r_1 r_2 + (b_2 - r_2) r_0, \quad k = k_1 k_2 + (v_2 - k_2) k_0.$$

4. Analysis of the design N in particular cases. 

Case (i): Let A be a BIB design with parameters v2 , be , r2 , ke , Kuz = Awe = De. 


2 oF 


We then have AA’ = (rz — X2)I(v2) + ALE(v2). Hence from Lemma 3.1 
NN’ na [(r2 — de) I(v2) + A2E(v2)] ® [e\1(v;) + C2E(v)] 
+ E(v2) ® [es(v1) + caE(v1)]. 





804 
Hence 


kC = [rk — ex(re — do) |E(ve) ® I(v1) — (es + crs) E(ve) ® I(r) 
— Co(T2 — Ao) (ve) @® E(u) — (ce, + core) E(ve) ® E(v). 


(4.1) 


Now let the number pair (77) denote the jth treatment which replaces the ith 
row of A (4 = 1, 2,---,u,j = 1, 2, +--+, ) in N. This means that the rows 
of N are numbered as (11), (12), --- , (Ive), (21), (22), --+ , (Que), --- , (m1), 
(1:2), --- , (ve). Also, let t;; and Q;; denote the effect and the adjusted total 
yield of the treatment (7j). In view of (4.1) the normal equations can be written 
as 

kQi; = Usb; — tks. — Usb. ; — ul... 
where

\[ u_1 = rk - c_1(r_2 - \lambda_2), \quad u_2 = c_3 + c_1\lambda_2, \quad u_3 = c_2(r_2 - \lambda_2), \quad u_4 = c_4 + c_2\lambda_2, \]


and é;., é.; and ¢.. have their usual meanings. Taking the additional equation 


“ 


t.. = 0, we can solve the normal equations uniquely and get the solution 


(4.3) b; -_ d,Q;; + d2Q);. + d3Q). ; , 


where 
d, = k/u, dz = kue/[uj(u — veue)), dz = kus/[us(u — vyus)), 


and Q,. and Q. ; have their usual meanings. Hence, from (2.6), the adjusted treat- 
ment sum of squares for testing the overall differences between the treatment 
effects is 


(4.4) > dH 4Q5 = dd G+ DQ. + ad e;. 


(4.4) gives a very simple expression for the computation of adjusted treatment 
sum of squares from the adjusted treatment totals. Using (2.8) in the solution 
(4.3) we can get the variances of the estimates of various elementary treatment 
comparisons. These are given in Table I. 


Case (ii): Let $A$ be a group divisible design with parameters $v_2 = mn$, $b_2$, $r_2$, $k_2$, $\lambda_{12}$, $\lambda_{22}$. If the $j$th treatment in the $i$th group is numbered as $(i - 1)n + j$ ($i = 1, 2, \dots, m$; $j = 1, 2, \dots, n$) it is easy to verify that

\[ AA' = (A_1 \backslash A_2)_m = I(m) \otimes (A_1 - A_2) + E(m) \otimes A_2, \tag{4.5} \]

where

\[ A_1 = (r_2 - \lambda_{12})I(n) + \lambda_{12}E(n), \quad A_2 = \lambda_{22}E(n). \]
TABLE I


Treatment Comparison. Number of Comparisons. Variance of the estimate 


ti; _— tis Q a 7) Ved, (0,; — 1 2(d; aa d;)a? 
ts; = tis(t x i) V\V2(V2e — 1) 2(d, + dz)a? 
ti; = tgrze (0 ~ ta # j’) V\V2(0; — 1) (ve — 1) 2(d, a. d, + d;)c? 


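Hence

\[ AA' = (r_2 - \lambda_{12})I(m) \otimes I(n) + (\lambda_{12} - \lambda_{22})I(m) \otimes E(n) + \lambda_{22}E(m) \otimes E(n). \]

Therefore, from Lemma 3.1,

\[ NN' = (rk - u_1)I(m) \otimes I(n) \otimes I(v_1) + u_2I(m) \otimes E(n) \otimes I(v_1) + u_3E(m) \otimes E(n) \otimes I(v_1) + u_4I(m) \otimes I(n) \otimes E(v_1) + u_5I(m) \otimes E(n) \otimes E(v_1) + u_6E(m) \otimes E(n) \otimes E(v_1), \tag{4.7} \]

where

\[ u_1 = rk - c_1(r_2 - \lambda_{12}), \quad u_2 = c_1(\lambda_{12} - \lambda_{22}), \quad u_3 = c_3 + c_1\lambda_{22}, \]
\[ u_4 = c_2(r_2 - \lambda_{12}), \quad u_5 = c_2(\lambda_{12} - \lambda_{22}), \quad u_6 = c_4 + c_2\lambda_{22}. \tag{4.8} \]

Hence

\[ kC = u_1I(m) \otimes I(n) \otimes I(v_1) - u_2I(m) \otimes E(n) \otimes I(v_1) - u_3E(m) \otimes E(n) \otimes I(v_1) - u_4I(m) \otimes I(n) \otimes E(v_1) - u_5I(m) \otimes E(n) \otimes E(v_1) - u_6E(m) \otimes E(n) \otimes E(v_1). \tag{4.9} \]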


Let the number triplet $(ijq)$ denote the $q$th treatment in $N$ which replaces the treatment numbered $(i - 1)n + j$ of $A$ ($i = 1, 2, \dots, m$; $j = 1, 2, \dots, n$; $q = 1, 2, \dots, v_1$). Also, let $t_{ijq}$ and $Q_{ijq}$ denote the effect and the adjusted total for the treatment $(ijq)$. In view of (4.9), it can be verified that the reduced normal equations may be written as

\[ kQ_{ijq} = u_1\hat{t}_{ijq} - u_2\hat{t}_{i\cdot q} - u_3\hat{t}_{\cdot\cdot q} - u_4\hat{t}_{ij\cdot} - u_5\hat{t}_{i\cdot\cdot} - u_6\hat{t}_{\cdot\cdot\cdot}, \tag{4.10} \]

where $\hat{t}_{i\cdot q}$, etc., have their usual meanings. Taking the additional equation $\hat{t}_{\cdot\cdot\cdot} = 0$, we can solve (4.10) uniquely and get the solution as

\[ \hat{t}_{ijq} = d_1Q_{ijq} + d_2Q_{i\cdot q} + d_3Q_{ij\cdot} + d_4Q_{i\cdot\cdot} + d_5Q_{\cdot\cdot q}, \tag{4.11} \]

where

\[ d_1 = k/u_1, \quad d_2 = ku_2/[u_1(u_1 - nu_2)], \quad d_3 = ku_4/[u_1(u_1 - v_1u_4)], \]
\[ d_4 = \{k/[u_1(u_1 - nu_2 - v_1u_4 - nv_1u_5)]\}\{u_2(u_4 + nu_5)/(u_1 - nu_2) + u_4(u_2 + v_1u_5)/(u_1 - v_1u_4) + u_5\}, \]
\[ d_5 = ku_3/[(u_1 - nu_2)(u_1 - nu_2 - mnu_3)], \]

and $u_1, u_2, \dots$, etc., are defined as in (4.8) and $Q_{i\cdot q}, \dots$, etc., have their usual meanings. Hence the adjusted treatment sum of squares for testing the overall differences between the treatment effects is

\[ d_1\sum_{i,j,q}Q_{ijq}^2 + d_2\sum_{i,q}Q_{i\cdot q}^2 + d_3\sum_{i,j}Q_{ij\cdot}^2 + d_4\sum_{i}Q_{i\cdot\cdot}^2 + d_5\sum_{q}Q_{\cdot\cdot q}^2. \tag{4.12} \]


The variances of the estimates of various elementary treatment comparisons are 
given in Table II. 

Case (iii): Let $A$ be a Latin Square type of design with parameters $v_2 = n^2$, $b_2$, $r_2$, $k_2$, $\lambda_{12}$, $\lambda_{22}$. The association scheme of the design is determined by the rows and columns of the $n \times n$ array in which the treatments of the design are arranged. If the $j$th treatment in the $i$th row of this array is numbered as $(i - 1)n + j$, it is easy to verify that

\[ AA' = (A_1 \backslash A_2)_n = I(n) \otimes (A_1 - A_2) + E(n) \otimes A_2, \tag{4.13} \]

where

\[ A_1 = (r_2 - \lambda_{12})I(n) + \lambda_{12}E(n), \quad A_2 = (\lambda_{12} - \lambda_{22})I(n) + \lambda_{22}E(n). \]

Hence

\[ AA' = I(n) \otimes [(r_2 - 2\lambda_{12} + \lambda_{22})I(n) + (\lambda_{12} - \lambda_{22})E(n)] + E(n) \otimes [(\lambda_{12} - \lambda_{22})I(n) + \lambda_{22}E(n)]. \]


TABLE II

Treatment comparison | Number of comparisons | Variance of the estimate
$t_{ijq} - t_{ijq'}$ $(q \ne q')$ | $mnv_1(v_1 - 1)$ | $2(d_1 + d_2 + d_5)\sigma^2$
$t_{ijq} - t_{ij'q}$ $(j \ne j')$ | $mn(n - 1)v_1$ | $2(d_1 + d_3)\sigma^2$
$t_{ijq} - t_{ij'q'}$ $(j \ne j', q \ne q')$ | $mn(n - 1)v_1(v_1 - 1)$ | $2(d_1 + d_2 + d_3 + d_5)\sigma^2$
$t_{ijq} - t_{i'j'q}$ $(i \ne i')$ | $m(m - 1)n^2v_1$ | $2(d_1 + d_2 + d_3 + d_4)\sigma^2$
$t_{ijq} - t_{i'j'q'}$ $(i \ne i', q \ne q')$ | $m(m - 1)n^2v_1(v_1 - 1)$ | $2(d_1 + d_2 + d_3 + d_4 + d_5)\sigma^2$

As in case (ii), after some simplification, it can be shown that the coefficient matrix $C$ of the reduced normal equations is given by
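\[ kC = u_1I(n) \otimes I(n) \otimes I(v_1) - u_2[I(n) \otimes E(n) \otimes I(v_1) + E(n) \otimes I(n) \otimes I(v_1)] - u_3E(n) \otimes E(n) \otimes I(v_1) - u_4I(n) \otimes I(n) \otimes E(v_1) - u_5[I(n) \otimes E(n) \otimes E(v_1) + E(n) \otimes I(n) \otimes E(v_1)] - u_6E(n) \otimes E(n) \otimes E(v_1), \tag{4.14} \]

where

\[ u_1 = rk - c_1(r_2 - 2\lambda_{12} + \lambda_{22}), \quad u_2 = c_1(\lambda_{12} - \lambda_{22}), \quad u_3 = c_3 + c_1\lambda_{22}, \]
\[ u_4 = c_2(r_2 - 2\lambda_{12} + \lambda_{22}), \quad u_5 = c_2(\lambda_{12} - \lambda_{22}), \quad u_6 = c_4 + c_2\lambda_{22}. \tag{4.15} \]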
Let the number triplet $(ijq)$ denote the $q$th treatment, which replaces the treatment numbered $(i - 1)n + j$ of $A$ in $N$. Also, let $t_{ijq}$ and $Q_{ijq}$ denote the effect and adjusted total of the treatment $(ijq)$. Using (4.14), the reduced normal equations can be written as

\[ kQ_{ijq} = u_1\hat{t}_{ijq} - u_2(\hat{t}_{i\cdot q} + \hat{t}_{\cdot jq}) - u_3\hat{t}_{\cdot\cdot q} - u_4\hat{t}_{ij\cdot} - u_5(\hat{t}_{i\cdot\cdot} + \hat{t}_{\cdot j\cdot}) - u_6\hat{t}_{\cdot\cdot\cdot}, \tag{4.16} \]

where $u_1, u_2, \dots$, etc., are as in (4.15) and $\hat{t}_{i\cdot q}, \dots$, etc., have their usual meanings. Taking the additional equation $\hat{t}_{\cdot\cdot\cdot} = 0$, we can solve the equations (4.16) uniquely and get the solution as

\[ \hat{t}_{ijq} = d_1Q_{ijq} + d_2(Q_{i\cdot q} + Q_{\cdot jq}) + d_3Q_{ij\cdot} + d_4(Q_{i\cdot\cdot} + Q_{\cdot j\cdot}) + d_5Q_{\cdot\cdot q}, \tag{4.17} \]

where

\[ d_1 = k/u_1, \quad d_2 = ku_2/[u_1(u_1 - nu_2)], \quad d_3 = ku_4/[u_1(u_1 - v_1u_4)], \]


TABLE III

Treatment comparison | Number of comparisons | Variance of the estimate
$t_{ijq} - t_{ijq'}$ $(q \ne q')$ | $n^2v_1(v_1 - 1)$ | $2(d_1 + 2d_2 + d_5)\sigma^2$
$t_{ijq} - t_{ij'q}$ or $t_{ijq} - t_{i'jq}$ $(i \ne i', j \ne j')$ | $2n^2(n - 1)v_1$ | $2(d_1 + d_2 + d_3 + d_4)\sigma^2$
$t_{ijq} - t_{ij'q'}$ or $t_{ijq} - t_{i'jq'}$ $(i \ne i', j \ne j', q \ne q')$ | $2n^2(n - 1)v_1(v_1 - 1)$ | $2(d_1 + 2d_2 + d_3 + d_4 + d_5)\sigma^2$
$t_{ijq} - t_{i'j'q}$ $(i \ne i', j \ne j')$ | $n^2(n - 1)^2v_1$ | $2(d_1 + 2d_2 + d_3 + 2d_4)\sigma^2$
$t_{ijq} - t_{i'j'q'}$ $(i \ne i', j \ne j', q \ne q')$ | $n^2(n - 1)^2v_1(v_1 - 1)$ | $2(d_1 + 2d_2 + d_3 + 2d_4 + d_5)\sigma^2$


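\[ d_4 = \{k/(u_1 - nu_2 - v_1u_4 - nv_1u_5)\}\{[u_2(u_4 + nu_5)]/[u_1(u_1 - nu_2)] + [u_4(u_2 + v_1u_5)]/[u_1(u_1 - v_1u_4)] + u_5/u_1\}, \]
\[ d_5 = \{k/(u_1 - 2nu_2 - n^2u_3)\}\{[2u_2(u_2 + nu_3)]/[u_1(u_1 - nu_2)] + u_3/u_1\}. \]

Hence the adjusted treatment sum of squares is given by

\[ d_1\sum_{i,j,q}Q_{ijq}^2 + d_2\Big(\sum_{i,q}Q_{i\cdot q}^2 + \sum_{j,q}Q_{\cdot jq}^2\Big) + d_3\sum_{i,j}Q_{ij\cdot}^2 + d_4\Big(\sum_{i}Q_{i\cdot\cdot}^2 + \sum_{j}Q_{\cdot j\cdot}^2\Big) + d_5\sum_{q}Q_{\cdot\cdot q}^2. \tag{4.18} \]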


The variances of the estimates of various elementary treatment comparisons 
are given in Table III. 
notice that $NN'$ of Lemma 3.1 will be simplified with $N_0 = O(v_1 \times b_1)$ or $N_0 = E(v_1 \times b_1)$.
ON A LOCALLY MOST POWERFUL BOUNDARY RANDOMIZED SIMILAR
TEST FOR THE INDEPENDENCE OF TWO POISSON
VARIABLES


By Mohamad Salahuddin Ahmed
University of California, Berkeley
0. Summary. By definition $(X, Y)$ is a bivariate Poisson vector if $(X, Y) = (X^* + U, Y^* + U)$ where $X^*$, $Y^*$ and $U$ are three independent Poisson variables with, say, respective expectations $a$, $b$ and $d$.

Let $(X_n, Y_n)$, $n = 1, 2, \dots, N$, be independent observations on a bivariate Poisson vector $(X, Y)$. It is shown that no test for the independence of $X$ and $Y$ can be both boundary randomized similar in $a$ and $b$ and also uniformly most powerful. However, a test of the form

\[ \varphi_0(Z \mid s, t) = 1,\ \gamma_0,\ 0 \quad \text{according as} \quad Z = \sum_{n=1}^{N} X_nY_n \ >,\ =,\ <\ k_0(N, s, t), \]

given $S = s$ and $T = t$, where

\[ \sum_{n=1}^{N} X_n = S, \quad \sum_{n=1}^{N} Y_n = T, \]

is boundary randomized similar and locally most powerful. Using a lemma on the convergence to a Normal probability distribution function of the conditional probability distribution function of $\sum_{n=1}^{N} (X_n - sN^{-1})(Y_n - tN^{-1})$ given $S = s$ and $T = t$, asymptotic formulae for the values of the $k_0(N, s, t)$ corresponding to a given level of significance are derived. In addition, it is shown that the asymptotic power of the test can be obtained from an approximation to a Normal probability function and that, in case instead of $k_0(N, s, t)$ its value calculated from the asymptotic formulae is used, the modified test is asymptotically locally most powerful in the sense of Definition 2.

To extend the domain of application of this test, we replace $U$ in the definition of the bivariate Poisson vector by another random variable $W$ also taking nonnegative integral values according to the probability function

\[ P\{W = w \mid c\} = \int_0^\infty \frac{e^{-ct}(ct)^w}{w!} f(t)\,dt, \]

where $f(t)$ is any continuous probability function satisfying

\[ \frac{\partial}{\partial c}\int_0^\infty \frac{e^{-ct}(ct)^w}{w!} f(t)\,dt\,\Big|_{c=0} = \int_0^\infty \frac{\partial}{\partial c}\Big\{\frac{e^{-ct}(ct)^w}{w!}\Big\} f(t)\,dt\,\Big|_{c=0}, \]

$c \ge 0$, $w = 0, 1, 2, \dots$ and $\int_0^\infty tf(t)\,dt < \infty$. In this way a class of bivariate
probability functions is obtained such that for every member of this class also, 
regarded as a probability function of the random vector (X, Y), the locally 
most powerful boundary randomized similar test for the independence of the two 
random variables X and Y is the same as the one given in the Poisson case. 


1. Introduction. In recent years the application of stochastic processes to 
problems in biology, physics, etc., has created a demand for new tests of in- 
dependence. The present study is an attempt in this direction. 

Under the hypothesis of independence we assume that the random variables 
X and Y are Poisson variables. The first problem is to define a bivariate Poisson 
vector (X, Y) when the Poisson variables X and Y are not independent. In 
Section 2 we give a definition of a bivariate Poisson vector. Equivalent defini- 
tions of a bivariate Poisson vector have been given earlier by several authors 
following different lines of attack (see Teicher [10], p. 2 and Loéve [5], p. 84). 
Also, we consider some characteristics of the bivariate Poisson vector. In particu- 
lar it is shown that the two Poisson variables X and Y are independent if and 
only if they are uncorrelated. In Section 3 a test for the independence of two 
Poisson variables is obtained and furthermore, the properties of the test, already 
enumerated in the Summary, are proved. Section 4 discusses the extension of 
the domain of application of the test of independence for two Poisson variables. 
An example is given for which the locally most powerful boundary randomized 
similar test of independence is not of the form obtained in the case of a bivariate 
Poisson vector. 


2. The bivariate Poisson vector. The simplest definition of the bivariate 
Poisson vector is 


DEFINITION 1. A bivariate random vector $(X, Y)$ is a bivariate Poisson vector if

\[ (X, Y) = (X^* + U, Y^* + U), \]

where $X^*$, $Y^*$ and $U$ are independent Poisson variables.

If the Poisson variables $X^*$, $Y^*$ and $U$ have respective expectations $a$, $b$ and $d$, then the probability function of the bivariate Poisson vector $(X, Y)$ is given by

\[ P\{X = x, Y = y \mid a, b, d\} = e^{-(a+b+d)}a^xb^y \sum_{u=0}^{u^*} \frac{(d/ab)^u}{(x - u)!\,u!\,(y - u)!}, \tag{1} \]

where $u^* = \min(x, y)$. We shall refer to the system of parameters used in (1) as the system $(a, b, d)$. A direct consequence of Definition 1 is

THEOREM 1. If $(X, Y)$ is a bivariate Poisson vector, then,

(i) the marginals X and Y are Poisson variables;

(ii) the correlation between X and Y is nonnegative; and 

(iii) the Poisson variables X and Y are independent if and only if the correla- 
tion between X and Y is zero. 
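
The probability function (1) and the statements of Theorem 1 can be illustrated numerically. The following is a minimal sketch, assuming $a, b > 0$ and truncating the infinite sums at a finite grid; the function names `bivariate_poisson_pmf` and `moments` and the parameter values are illustrative choices, not part of the paper.

```python
from math import exp, factorial

def bivariate_poisson_pmf(x, y, a, b, d):
    """Probability function (1) of (X, Y) = (X* + U, Y* + U); requires a, b > 0."""
    total = sum((d / (a * b)) ** u
                / (factorial(x - u) * factorial(u) * factorial(y - u))
                for u in range(min(x, y) + 1))
    return exp(-(a + b + d)) * a ** x * b ** y * total

def moments(a, b, d, limit=40):
    """Total mass, means and covariance from the grid 0 <= x, y < limit."""
    mass = mean_x = mean_y = mean_xy = 0.0
    for x in range(limit):
        for y in range(limit):
            p = bivariate_poisson_pmf(x, y, a, b, d)
            mass += p
            mean_x += x * p
            mean_y += y * p
            mean_xy += x * y * p
    return mass, mean_x, mean_y, mean_xy - mean_x * mean_y

# Assumed illustrative parameter values.
mass, mean_x, mean_y, cov = moments(a=1.0, b=1.5, d=0.5)
print(mass, mean_x, mean_y, cov)
```

With these values the marginal means should be close to $a + d$ and $b + d$ and the covariance close to $d$, which is nonnegative and vanishes exactly when $d = 0$, in which case (1) factors into a product of two Poisson probability functions.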

Let $(X_n, Y_n)$, $n = 1, 2, \dots, N$, be independent observations on a bivariate Poisson vector $(X, Y)$. Henceforth we shall use the notation

\[ x = (X_1, Y_1, X_2, Y_2, \dots, X_N, Y_N); \quad S = \sum_{n=1}^{N} X_n, \quad T = \sum_{n=1}^{N} Y_n, \quad Z = \sum_{n=1}^{N} X_nY_n. \]

We also introduce the system $(a, b, c)$, where $d = abc$. With this notation we have

THEOREM 2. For the system $(a, b, c)$,

(i) the vector statistic $(S, T)$ is a vector sufficient statistic for $(a, b)$ whatever be $c$; and

(ii) if $c = 0$, the vector sufficient statistic $(S, T)$ is also a vector complete sufficient statistic for $(a, b)$.

PROOF. The probability function of the bivariate Poisson vector $(X, Y)$ corresponding to the system $(a, b, c)$ is easily obtained by substituting $d = abc$ in (1). Accordingly, the joint conditional probability function of $(X_n, Y_n)$, $n = 1, 2, \dots, N$, given $S = s$, $T = t$, is

\[ q_c(x \mid s, t) = \frac{\prod_{n=1}^{N} \sum_{u=0}^{u_n^*} c^u/[(X_n - u)!\,u!\,(Y_n - u)!]}{\sum_{S=s,\,T=t} \prod_{n=1}^{N} \sum_{u=0}^{u_n^*} c^u/[(X_n - u)!\,u!\,(Y_n - u)!]}, \tag{2} \]

where $u_n^* = \min(X_n, Y_n)$, which is independent of $a$ and $b$. This proves the sufficiency in (i) and (ii).

The proof of the completeness in (ii) depends on the fact that, when $c = 0$, the random variables $S$ and $T$ are independent Poisson variables with respective expectations $Na$ and $Nb$. Now consider

\[ \sum_{s=0}^{\infty}\sum_{t=0}^{\infty} g(s, t)\,e^{-N(a+b)}\,\frac{(Na)^s}{s!}\,\frac{(Nb)^t}{t!} = 0, \]

for all nonnegative $a$ and $b$. Viewing this series as a convergent power series in $a$ and $b$, we see that $g(s, t) = 0$ almost everywhere. This proves completeness (see Lehmann and Scheffé [4], p. 311).


3. Test for independence. In this section we shall obtain a locally most power- 
ful boundary randomized similar (LMP BRS) test for the independence of two 
Poisson variables and prove the nonexistence of the uniformly most powerful 
boundary randomized similar (UMP BRS) test. Also, we shall obtain the asymp- 
totic distribution theory of the test statistic under the hypothesis and prove 
certain asymptotic properties of the test. Finally we shall discuss an application 
of this test. 
3.1. LMP BRS test for independence. In the system $(a, b, c)$ the test for the independence of two Poisson variables is equivalent to testing the hypothesis $c = 0$ against the alternative $c > 0$, whatever be $a$ and $b$. Let $(X_n, Y_n)$, $n = 1, 2, \dots, N$, be independent observations on a bivariate Poisson vector $(X, Y)$ with parameters $(a, b, c)$. Finally, let $p_0(x)$ and $p_c(x)$ denote the likelihood functions respectively under the hypothesis and the alternative.

In order to get rid of the unknown (nuisance) parameters $a$ and $b$ occurring in $p_0(x)$ and $p_c(x)$ we shall use instead of these the conditional likelihood functions $q_0(x \mid s, t)$ and $q_c(x \mid s, t)$ respectively, when $S = s$ and $T = t$, which are obtainable from (2). The geometrical picture is of surfaces for each set of values of $(S, T)$. To obtain the LMP similar test for the independence of $X$ and $Y$, the method is to find a test function $\varphi(x \mid s, t)$, $0 \le \varphi(x \mid s, t) \le 1$, defined on the hypersurface given by $S = s$, $T = t$, such that whatever be $a$ and $b$,

\[ \sum \varphi(x \mid s, t)\,q_0(x \mid s, t) = \alpha, \quad 0 < \alpha < 1, \]

and

\[ \frac{\partial}{\partial c} \sum \varphi(x \mid s, t)\,q_c(x \mid s, t)\,\Big|_{c=0} \]

is a maximum, where $\alpha$ is the preassigned size of the test. The summation is over the sample points on the hypersurface given by $S = s$, $T = t$. The meaning of this test function is that when $x$ is observed on the hypersurface defined by $S = s$, $T = t$, the hypothesis of independence is rejected with probability $\varphi(x \mid s, t)$. This procedure to obtain the LMP similar test is valid due to the conclusions obtained in Theorem 2 (also see Lehmann and Scheffé [4], pp. 311, 318). The solution to this problem is given (see Neyman and Pearson [7], p. 10) by

\[ \varphi(x \mid s, t) = 1,\ \gamma,\ 0 \quad \text{according as} \quad \frac{\partial}{\partial c}\log q_c(x \mid s, t)\,\Big|_{c=0} \ >,\ =,\ <\ k_1. \]

Substituting the expression for $q_c(x \mid s, t)$ from (2) and after simplification of the required differentiation we obtain that the condition

\[ \frac{\partial}{\partial c}\log q_c(x \mid s, t)\,\Big|_{c=0} \ >,\ =,\ <\ k_1 \]

is equivalent to the condition

\[ Z = \sum_{n=1}^{N} X_nY_n \ >,\ =,\ <\ k_0 = k_0(N, s, t). \tag{3} \]

In particular we notice that the test $\varphi(x \mid s, t)$ is a function of $x$ through $Z$ only. Therefore, the test will be denoted by $\varphi_0(Z \mid s, t)$.
Since the sample space of the random variable $Z$ is discrete, to obtain a similar test it is sufficient to use randomization only on the critical value $k_0$ of $Z$; hence the name BRS test. We remark also that with this agreement there is a one-to-one correspondence between the BRS test and the BRS region. Henceforth we shall use these terms interchangeably. Finally we have,

THEOREM 3. If $(X_n, Y_n)$, $n = 1, 2, \dots, N$, are independent observations on a bivariate Poisson vector $(X, Y)$ with parameters $(a, b, c)$, then the LMP BRS test for the independence of the two Poisson variables $X$ and $Y$ is given by the rule:

\[ \varphi_0(Z \mid s, t) = 1,\ \gamma_0,\ 0 \quad \text{according as} \quad Z \ >,\ =,\ <\ k_0 = k_0(N, s, t). \tag{4} \]

The values of $k_0$ and $\gamma_0$ are determined in such a way that

\[ \sum \varphi_0(Z \mid s, t)\,q_0(Z \mid s, t) = \alpha. \tag{5} \]

3.2. Nonexistence of the UMP BRS test. Let $\varphi_0$ and $\varphi$ respectively be the LMP BRS and the UMP BRS tests of size $\alpha$ for testing $c = 0$ against $c > 0$. Let $\beta(\varphi_0; a, b, c)$ and $\beta(\varphi; a, b, c)$ respectively be the power of the tests $\varphi_0$ and $\varphi$ against the alternative $(a, b, c)$. Knowing that $(S, T)$ is a vector sufficient statistic for $(a, b)$ we can write for the test $\varphi_0$,

\[ \beta(\varphi_0; a, b, c) = \sum_{s,t} \beta(\varphi_0; s, t, c)\,r(s, t; a, b, c), \tag{6} \]

where $\beta(\psi; s, t, c)$ is the conditional power of the test $\psi$ on the hypersurface defined by $S = s$, $T = t$ against the alternative $c$, and $r(s, t; a, b, c)$ is the probability function of the vector statistic $(S, T)$ when the parameters have the value $(a, b, c)$. Similarly for the test $\varphi$ we have

\[ \beta(\varphi; a, b, c) = \sum_{s,t} \beta(\varphi; s, t, c)\,r(s, t; a, b, c). \tag{7} \]

THEOREM 4. There exists no UMP BRS test for testing the independence of two Poisson variables $X$ and $Y$ based on the independent observations $(X_n, Y_n)$, $n = 1, 2, \dots, N$, made on the bivariate Poisson vector $(X, Y)$.

PROOF. Since a UMP test is also a LMP test,

\[ \frac{\partial}{\partial c}\beta(\varphi_0; a, b, c)\,\Big|_{c=0} = \frac{\partial}{\partial c}\beta(\varphi; a, b, c)\,\Big|_{c=0}. \tag{8} \]

But we can rewrite (8) using (6) and (7) as follows:

\[ \sum_{s,t} \{\beta'(\varphi_0; s, t) + \beta(\varphi_0; s, t, 0)\,g(s, t; a, b)\}\,r(s, t; a, b, 0) = \sum_{s,t} \{\beta'(\varphi; s, t) + \beta(\varphi; s, t, 0)\,g(s, t; a, b)\}\,r(s, t; a, b, 0), \tag{9} \]

where

\[ \beta'(\psi; s, t) = \frac{\partial}{\partial c}\beta(\psi; s, t, c)\,\Big|_{c=0}; \quad g(s, t; a, b) = \frac{\partial}{\partial c}\log r(s, t; a, b, c)\,\Big|_{c=0}. \]


Furthermore, since the tests $\varphi_0$ and $\varphi$ have Neyman structure, we have, for all possible values of $s$ and $t$,

\[ \beta(\varphi_0; s, t, 0) = \beta(\varphi; s, t, 0) = \alpha. \]

Therefore, we obtain from (9) that

\[ \sum_{s,t} \{\beta'(\varphi_0; s, t) - \beta'(\varphi; s, t)\}\,r(s, t; a, b, 0) = 0. \]

This fact, together with the completeness of the vector sufficient statistic $(S, T)$ for $(a, b)$ whenever $c = 0$, proved in Theorem 2(ii), implies that

\[ h(\varphi_0, \varphi; s, t) = \beta'(\varphi_0; s, t) - \beta'(\varphi; s, t) = 0 \]

almost everywhere. In fact, more is true, viz., $\varphi_0 = \varphi$ almost everywhere. The proof is by contradiction. Simple calculation shows that

\[ h(\varphi_0, \varphi; s, t) = A^{-2}\Big\{\sum \varphi_0P(ZA - B) - \sum \varphi P(ZA - B)\Big\}, \tag{10} \]

where the summation is taken over the sample points on the hypersurface defined by $S = s$, $T = t$ and

\[ P = \prod_{n=1}^{N} \frac{1}{X_n!\,Y_n!}, \quad A = \sum_{S=s,\,T=t} P, \quad B = \sum_{S=s,\,T=t} ZP. \]

Using the form of the test $\varphi_0$ given in (4) we obtain from (10) that

\[ h(\varphi_0, \varphi; s, t) > A^{-2}(k_0A - B)\Big(\sum \varphi_0P - \sum \varphi P\Big) = 0, \]

which contradicts (8). Hence $\varphi_0 = \varphi$ almost everywhere.

Therefore, to show that there exists no UMP BRS test it is enough to show that $\varphi_0$ is not a UMP BRS test. We will show this numerically. For a specific alternative let $b = b$, $a \to 0$ and $c \to \infty$ in such a manner that in the limit $a = 0$ and $abc = d$. Then using Definition 1 we can see that $X_n$ and $Y_n - X_n$ are independent Poisson variables with respective expectations $d$ and $b$. The most powerful conditional test against the alternative described above when $S = s$, $T = t$ is given by

\[ \varphi(x \mid s, t) = 1,\ \gamma,\ 0 \quad \text{according as} \quad \prod_{n=1}^{N} Y_n!/(Y_n - X_n)! \ >,\ =,\ <\ k. \]

Let $N = 2$, $s = 3$, $t = 4$ and $\alpha = 1/8$. Then the test $\varphi_0(x \mid 3, 4)$ is 1 or 0 according as $x = (X_1, Y_1; X_2, Y_2)$ belongs to the set {(3, 4; 0, 0), (0, 0; 3, 4), (3, 3; 0, 1), (0, 1; 3, 3), (2, 4; 1, 0), (1, 0; 2, 4)} or not. Also the test $\varphi(x \mid 3, 4)$ is such that it is equal to 1 if $x$ belongs to the set {(3, 4; 0, 0), (0, 0; 3, 4)}, it is equal to $\gamma = 7/16$ if $x$ belongs to the set {(3, 3; 0, 1), (0, 1; 3, 3), (2, 3; 1, 1), (1, 1; 2, 3)} and it is equal to zero for all other points on the hypersurface defined by $s = 3$, $t = 4$. We can easily check that the size of both these tests is 1/8. Finally, simple calculation shows that the powers of the tests $\varphi_0$ and $\varphi$ are respectively 1/4 and 11/32, showing that the test $\varphi_0$ is not the UMP test. This completes the proof of the theorem.
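
The sizes and powers in this example can be recomputed exactly. The sketch below is an illustration under the assumptions stated in the proof: under the hypothesis, given $S = 3$ and $T = 4$ with $N = 2$, $X_1$ and $Y_1$ are independent binomials with 3 and 4 trials and probability 1/2; under the limiting alternative, $X_1$ and $Y_1 - X_1$ are independent binomials with 3 and 1 trials. The names `null_prob`, `alt_prob` and `expectation` are hypothetical.

```python
from fractions import Fraction
from math import comb

S, T = 3, 4  # fixed values of the sums for N = 2

def binom_half(n, k):
    """P{Binomial(n, 1/2) = k}, zero outside 0..n."""
    return Fraction(comb(n, k), 2 ** n) if 0 <= k <= n else Fraction(0)

def null_prob(point):
    """Conditional probability of (x1, y1, x2, y2) when c = 0."""
    x1, y1, _, _ = point
    return binom_half(S, x1) * binom_half(T, y1)

def alt_prob(point):
    """Conditional probability under the limiting alternative (a = 0, abc = d)."""
    x1, y1, _, _ = point
    return binom_half(S, x1) * binom_half(T - S, y1 - x1)

def expectation(test, prob):
    """Rejection probability of a test given as {sample point: rejection probability}."""
    return sum(phi * prob(point) for point, phi in test.items())

one = Fraction(1)
gamma = Fraction(7, 16)
phi0 = {(3, 4, 0, 0): one, (0, 0, 3, 4): one, (3, 3, 0, 1): one,
        (0, 1, 3, 3): one, (2, 4, 1, 0): one, (1, 0, 2, 4): one}
phi = {(3, 4, 0, 0): one, (0, 0, 3, 4): one, (3, 3, 0, 1): gamma,
       (0, 1, 3, 3): gamma, (2, 3, 1, 1): gamma, (1, 1, 2, 3): gamma}

size0, size1 = expectation(phi0, null_prob), expectation(phi, null_prob)
power0, power1 = expectation(phi0, alt_prob), expectation(phi, alt_prob)
print(size0, size1, power0, power1)
```

Both tests have the same size while the second has the larger power against this alternative, which is the comparison the proof needs.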


3.3. Asymptotic distribution of the test statistic under the hypothesis. Define

\[ S_N = N^{-1/2}\sum_{n=1}^{N} (X_n - a), \quad T_N = N^{-1/2}\sum_{n=1}^{N} (Y_n - b), \quad Z_N = N^{-1/2}\sum_{n=1}^{N} (X_n - a)(Y_n - b). \]

It follows from the central limit theorem for vectors (see Cramér [1], p. 316) that if $N \to \infty$ then

\[ E\,e^{iuZ_N + i\xi S_N + i\eta T_N} \to e^{-\frac{1}{2}(abu^2 + a\xi^2 + b\eta^2)} = E\,e^{iuZ_\infty + i\xi S_\infty + i\eta T_\infty}, \tag{11} \]

for all $u$, $\xi$, $\eta$, where new random variables $Z_\infty$, $S_\infty$ and $T_\infty$ are so defined that the equality sign holds. Using the notation

\[ E(e^{iuZ_k} \mid S_k = s, T_k = t) = \omega_k(u, s, t), \]

we obtain by rewriting (11)

\[ \int e^{i\xi s + i\eta t}\,\omega_N(u, s, t)\,dP_N(s, t) \to \int e^{i\xi s + i\eta t}\,\omega_\infty(u, s, t)\,dP_\infty(s, t) \tag{12} \]

for all $u$, $\xi$, $\eta$, where $P_k(s, t)$ is the probability distribution function of the bivariate random vector $(S_k, T_k)$. Furthermore, if we define

\[ Q_k(u, A) = \int_A \omega_k(u, s, t)\,dP_k(s, t), \]

then clearly $Q_k(u, A)$ is a complex measure satisfying

\[ \sup_{u,A} |Q_k(u, A)| \le 1 \quad \text{for } k = 0, 1, 2, \dots. \]

With this notation (12) can be written as

\[ \int e^{i\xi s + i\eta t}\,dQ_N(u, s, t) \to \int e^{i\xi s + i\eta t}\,dQ_\infty(u, s, t) \tag{13} \]

for all $u$, $\xi$, $\eta$.
If $f(s, t)$ is any bounded continuous function (see LeCam [3], p. 28) then (13) is equivalent to saying that

\[ \int f(s, t)\,dQ_N(u, s, t) \to \int f(s, t)\,dQ_\infty(u, s, t) \]

for all $u$. Finally, reintroducing $\omega_k(u, s, t)$ we obtain that

\[ \int f(s, t)\,\omega_N(u, s, t)\,dP_N(s, t) \to \int f(s, t)\,\omega_\infty(u, s, t)\,dP_\infty(s, t) \tag{14} \]

for all $u$. This indicates that $\omega_N(u, s, t)$ might tend pointwise to $\omega_\infty(u, s, t)$. Such is the case as will follow from Lemmas 1 and 2.

LEMMA 1. If the distribution function of the random vector $(Z_N, S_N, T_N)$ converges to the distribution function of the random vector $(Z_\infty, S_\infty, T_\infty)$ when $N \to \infty$, and if the family of the conditional characteristic functions $\{\omega_N(u, s, t)\}$ is uniformly bounded and is equicontinuous for all $u$, $s$, $t$ and uniformly so for $u$, $s$, $t$ bounded then,

\[ \omega_N(u, s, t) \to \omega_\infty(u, s, t) \]

as $N \to \infty$ for all $u$, $s$, $t$ and uniformly so for $u$, $s$, $t$ bounded.
PROOF. According to Ascoli's theorem (see Graves [2], p. 122) we can find a subsequence $\omega_{N_k}(u, s, t)$ and a continuous $\omega(u, s, t)$ such that

\[ \omega_{N_k}(u, s, t) \to \omega(u, s, t) \tag{15} \]

for all $u$, $s$, $t$ and uniformly so for $u$, $s$, $t$ bounded. Replacing $N$ by $N_k$ in (14) we obtain

\[ \int f(s, t)\,\omega_{N_k}(u, s, t)\,dP_{N_k}(s, t) \to \int f(s, t)\,\omega_\infty(u, s, t)\,dP_\infty(s, t) \tag{16} \]

for all $u$. Let us consider the equality

\[ \int f(s, t)\,\omega_{N_k}(u, s, t)\,dP_{N_k}(s, t) = \int f(s, t)\,\omega(u, s, t)\,dP_\infty(s, t) + \int f(s, t)\,\omega(u, s, t)\,d\{P_{N_k}(s, t) - P_\infty(s, t)\} + \int f(s, t)\{\omega_{N_k}(u, s, t) - \omega(u, s, t)\}\,dP_{N_k}(s, t). \tag{17} \]

Using (15) and the fact that $f(s, t)$ is bounded, we can see that

\[ \int f(s, t)\{\omega_{N_k}(u, s, t) - \omega(u, s, t)\}\,dP_{N_k}(s, t) \to 0 \]

for all $u$ when $N_k \to \infty$. Also due to the Helly-Bray theorem (see Loève [6], p. 182), since $P_{N_k}(s, t) \to P_\infty(s, t)$ for all values of $s$, $t$,

\[ \int f(s, t)\,\omega(u, s, t)\,d\{P_{N_k}(s, t) - P_\infty(s, t)\} \to 0 \]

for all $u$ when $N_k \to \infty$. The above two conclusions, together with (17), imply that

\[ \int f(s, t)\,\omega_{N_k}(u, s, t)\,dP_{N_k}(s, t) \to \int f(s, t)\,\omega(u, s, t)\,dP_\infty(s, t), \tag{18} \]

for all $u$. Comparing (16) and (18) we obtain that

\[ \int f(s, t)\,\omega(u, s, t)\,dP_\infty(s, t) = \int f(s, t)\,\omega_\infty(u, s, t)\,dP_\infty(s, t) \]

for all values of $u$, which implies that

\[ \omega(u, s, t) = \omega_\infty(u, s, t) \quad \text{almost everywhere.} \tag{19} \]

The "almost everywhere" may be dropped since the functions under consideration are continuous. This proves that every convergent subsequence of $\{\omega_N(u, s, t)\}$ has the same limit. But more is true, viz.,

\[ \omega_N(u, s, t) \to \omega_\infty(u, s, t) \tag{20} \]

for all $u$, $s$, $t$ and uniformly so for $u$, $s$, $t$ bounded. Suppose that $\omega_N(u, s, t) \not\to \omega_\infty(u, s, t)$. Then there exist subsequences $N_k'$ and $N_k''$ such that

\[ \omega_{N_k'}(u, s, t) \to \omega'(u, s, t) = \limsup \omega_N(u, s, t) \]

and

\[ \omega_{N_k''}(u, s, t) \to \omega''(u, s, t) = \liminf \omega_N(u, s, t) \]

for all $u$, $s$, $t$ and uniformly so for $u$, $s$, $t$ bounded, where $\omega'(u, s, t)$ and $\omega''(u, s, t)$ are not equal. In the proof given above if we replace the sequence $N_k$ by the sequence $N_k'$ we obtain corresponding to (19) that

\[ \omega'(u, s, t) = \omega_\infty(u, s, t) \]

for all $u$, $s$, $t$ and similarly we obtain that

\[ \omega''(u, s, t) = \omega_\infty(u, s, t), \]

contradicting our assumption that $\omega'(u, s, t)$ and $\omega''(u, s, t)$ are not equal. Therefore, (20) is true. This completes the proof of Lemma 1.

LEMMA 2. The family $\{\omega_N(u, s, t)\}$ is uniformly bounded and is equicontinuous for all $u$, $s$, $t$ and uniformly so for $u$, $s$, $t$ bounded.

PROOF. Since $\omega_N(u, s, t)$ are conditional characteristic functions it follows that

\[ |\omega_N(u, s, t)| \le 1. \]
We remark that if $S_N = s$, then the $X_n$'s, $n = 1, 2, \dots, N$, are $N$-nomially distributed with probability of falling in any one of the $N$ classes being $N^{-1}$ and having parameter $k = N^{1/2}s + Na$ which is equal to the number of trials. Similarly, if $T_N = t$, then the $Y_n$'s, $n = 1, 2, \dots, N$, are $N$-nomially distributed with the same probability of falling in any one of the $N$ classes and with parameter $l = N^{1/2}t + Nb$. If $s$ and $t$ are changed to $s'$ and $t'$ respectively, such that $s' > s$ and $t' > t$, then $k$ and $l$ will change respectively to say $k + k'$ and $l + l'$ and correspondingly $X_n$ and $Y_n$ will change to $X_n + X_n'$ and $Y_n + Y_n'$, where $X_n'$ and $Y_n'$ are completely independent $N$-nomials of the above type but with respective parameters $k'$ and $l'$. Let

\[ Z_N(k, l) = N^{-1/2}\sum_{n=1}^{N} (X_n - a)(Y_n - b). \]

It is easy to see that

\[ Q_N = Z_N(k + k', l + l') - Z_N(k, l) = N^{-1/2}\sum_{n=1}^{N} (X_n + X_n' - a)(Y_n + Y_n' - b) - N^{-1/2}\sum_{n=1}^{N} (X_n - a)(Y_n - b) \]
\[ = N^{-1/2}\sum_{n=1}^{N} (X_n - a)Y_n' + N^{-1/2}\sum_{n=1}^{N} X_n'(Y_n - b) + N^{-1/2}\sum_{n=1}^{N} X_n'Y_n'. \]

In order to prove the required equicontinuity, we shall first show that for an 
arbitrary « > 0, there exists an e’(«, s, t, u, a, b) such that if 


s’—s<e and t' —t<é 
then for all NV 
\Qx| < $e 


where 
\Qn| = | Bte*7*"[He°" — 1) | k, k’, 1,0} 
| E{le™** — 1) |k, k’,1,0}| 


5 |B) indy 402 Q% [ eiment(y e) ae | ky RLU | 
S |u| |Z{ Qn | k, 1, U}| + ful? | B{Qu | k, k,1,0}). 


By direct calculation it can be shown that 


E{Qw | k, k’, 1, UV) = N“*(s't’ — st 
and 


E{Qy | k, k’, l, UW} = (N — 1)N“{b(s’ — 8) + a(t’ — 8} 


+ (N —1)N*(s't! — st) + N“(s't’ 
Therefore, with a suitable choice of $\epsilon' = \epsilon'(\epsilon, s, t, u, a, b)$ the required bound $\frac{1}{2}\epsilon$ holds for all $N$. I.e., in terms of $\omega_N(u, s, t)$ we have shown that if

\[ s' - s < \epsilon' \quad \text{and} \quad t' - t < \epsilon', \]

then

\[ |\omega_N(u, s', t') - \omega_N(u, s, t)| \le \tfrac{1}{2}\epsilon \quad \text{for all } N. \]

For any two points $(s_1, t_1)$ and $(s_2, t_2)$ satisfying the conditions

\[ |s_1 - s_2| < \epsilon' \quad \text{and} \quad |t_1 - t_2| < \epsilon' \]

we consider an auxiliary point

\[ (s, t) = (\min\{s_1, s_2\}, \min\{t_1, t_2\}) \]

and then using the triangle inequality we obtain that

\[ |\omega_N(u, s_1, t_1) - \omega_N(u, s_2, t_2)| \le |\omega_N(u, s_1, t_1) - \omega_N(u, s, t)| + |\omega_N(u, s, t) - \omega_N(u, s_2, t_2)| \le \tfrac{1}{2}\epsilon + \tfrac{1}{2}\epsilon = \epsilon \quad \text{for all } N. \]

This completes the proof of the equicontinuity of $\{\omega_N(u, s, t)\}$ for all $u$, $s$, $t$. It is easy to see that for bounded values of $u$, $s$, $t$ the family $\{\omega_N(u, s, t)\}$ is uniformly continuous. The above type of reasoning was necessary to prove equicontinuity because only the sum of the multinomials with the same number of events is again a multinomial.

Therefore, in view of these lemmas, the needed asymptotic characteristic function is $\omega_\infty(u, s, t)$. We can easily check that

\[ \omega_\infty(u, s, t) = \exp\{-abu^2/2\}, \]

which is a characteristic function of a Normal variable with mean zero and variance $ab$. Finally, denoting the probability law of a random variable $W$ by $\mathcal{L}(W)$ we can summarize the results obtained above as follows:

\[ \mathcal{L}\{(ab)^{-1/2}Z_N \mid S_N = s, T_N = t\} = \mathcal{L}\Big\{(Nab)^{-1/2}\sum_{n=1}^{N} (X_n - a)(Y_n - b) \,\Big|\, \sum_{n=1}^{N} (X_n - a) = N^{1/2}s,\ \sum_{n=1}^{N} (Y_n - b) = N^{1/2}t\Big\} \to N(0, 1) \]

for all values of $s$, $t$ and uniformly so for $s$, $t$ bounded.

But to make use of this result we have to know the values of $a$ and $b$. We recall that

\[ \sum_{n=1}^{N} X_n = N\bar{X} \quad \text{and} \quad \sum_{n=1}^{N} Y_n = N\bar{Y}. \]
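Using Slutsky's theorem (see Cramér [1], p. 254) it follows that the sequence

\[ \mathcal{L}\Big\{(N\bar{X}\bar{Y})^{-1/2}\sum_{n=1}^{N} (X_n - \bar{X})(Y_n - \bar{Y}) \,\Big|\, S_N = s, T_N = t\Big\} \]

also has the same limiting distribution $N(0, 1)$, for all values of $s$, $t$, and for $s$, $t$ bounded the convergence is uniform (see LeCam [3], p. 24). In what follows we shall use the notation

\[ \hat{Z}_N = (N\bar{X}\bar{Y})^{-1/2}\sum_{n=1}^{N} (X_n - \bar{X})(Y_n - \bar{Y}). \]

This gives the following

THEOREM 5. When $N \to \infty$,

\[ \mathcal{L}\{\hat{Z}_N \mid S_N = s, T_N = t\} \to N(0, 1) \]

for all $s$, $t$ and the convergence is uniform for $s$, $t$ bounded.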

In order to use the test of independence developed in Section 3.1 we must find 
first the values of $k_0$ and $\gamma_0$ satisfying (5). In the absence of the knowledge of 
the exact distribution function of Z we propose to use instead the test statistic 
$\tilde{Z}_N$ and its limiting distribution obtained above. For a given level of significance 
$\alpha$, first we find the value of k such that 


$$(2\pi)^{-1/2}\int_k^\infty e^{-u^2/2}\,du = \alpha$$


and then the modified test is given by the rule: 


$$\tilde\phi = 1 \text{ or } 0 \quad\text{according as}\quad \tilde{Z}_N > k \text{ or } \tilde{Z}_N \le k.$$


It is important to note that the condition of equicontinuity in the proof of 
Theorem 5 is not only sufficient but is also necessary in a certain sense. In 
applications, s and t are in fact functions of N, say $s_N$, $t_N$, which will tend to s 
and t when $N \to \infty$. In the calculation of probabilities, etc., though we propose 
to use the asymptotic distribution, instead of s and t we have to use the values 
$s_N$ and $t_N$ respectively. Equicontinuity will ensure that the values calculated 
with this substitution do not differ too much from the true values. 
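
The modified test can be illustrated numerically. The following is a minimal sketch, assuming the statistic is the standardized cross product $(N\bar{X}\bar{Y})^{-1/2}\sum(X_n - \bar{X})(Y_n - \bar{Y})$ and that k is the upper $\alpha$ point of N(0, 1); the function names are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist


def standardized_cross_product(xs, ys):
    """Assumed form of the statistic:
    (N * xbar * ybar)^(-1/2) * sum((x - xbar) * (y - ybar))."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    if xbar <= 0 or ybar <= 0:
        raise ValueError("both sample means must be positive")
    cross = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    return cross / math.sqrt(n * xbar * ybar)


def reject_independence(xs, ys, alpha=0.05):
    """One-sided rule: reject when the statistic exceeds the upper
    alpha point k of the standard normal distribution."""
    k = NormalDist().inv_cdf(1.0 - alpha)
    return standardized_cross_product(xs, ys) > k
```

Because k comes from the limiting law, the actual size for small N is only approximately $\alpha$.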


3.4. Asymptotic power function. The power of the modified test against the 
alternative (a, b, c) is 


$$\beta(a, b, c) = \int P\{\tilde{Z}_N > k \mid s, t, a^*, b^*, d\}\,dP_N(s, t \mid a^*, b^*, d)$$


where 


$$a^* = a + d, \qquad b^* = b + d.$$

It is interesting to note ahead that we can compute the asymptotic power of 
the modified test without using the conditional probability distribution 
function of $Z_N$ given $S_N = s$, $T_N = t$. If we define 


$$Z_N^* = N^{-1/2}\sum_{n=1}^{N}\{(X_n - a^*)(Y_n - b^*) - d\},$$


then it follows from the central limit theorem that under the alternative (a, b, c), 

$$(21)\qquad \mathcal{L}(Z_N^*) \to N(0, \sigma^2)$$


where $\sigma^2 = a^*b^* + d + d^2$. The decisive step now is to be able to represent 
$Z_N$ in terms of $Z_N^*$. It is easy to see that 


where 


and similarly, 
$$T_N^* = N^{-1/2}\sum_{n=1}^{N}(Y_n - b^*) = T_N - dN^{1/2} = t - dN^{1/2}.$$
Under the alternative (a, b, c) when $N \to \infty$, 


in probability. Furthermore, since s and ¢ are finite constants we have from (22) 


that 
— yo + dN’ +h 


where, 
a*) dN’ — sd — td + @“N. 
yt ~ (ab)?. 
Finally we obtain 
$$\tilde{Z}_N \sim (ab)^{-1/2}(Z_N^* + dN^{1/2} + h),$$


and therefore, asymptotically, 

$$\beta(a, b, c) \sim \int P[Z_N^* + dN^{1/2} + h > k(ab)^{1/2} \mid s, t, a^*, b^*, d]\,dP_N(s, t \mid a^*, b^*, d)$$

$$= P[Z_N^* + dN^{1/2} + h > k(ab)^{1/2} \mid a^*, b^*, d] = (2\pi)^{-1/2}\int_{\{k(ab)^{1/2} - dN^{1/2} - h\}/\sigma}^{\infty} e^{-u^2/2}\,du.$$

It is easily seen that $\beta(a, b, c) \to 1$ as $N \to \infty$ for fixed alternative (a, b, c). 
When $N \to \infty$, our interest is only in the values of c which are in the 
neighborhood of c = 0. If the convergence in (21) should hold also when $c \to 0$, it is 
necessary to have uniform convergence of the law in c for given fixed values of 
a and b. A sufficient condition for this uniformity of the convergence is (see 
Parzen [9], p. 38) 
— b* ae < M 


where M is independent of c. The existence of M can be easily verified. If in 
particular 
$$cN^{1/2} \to \gamma(ab)^{1/2}$$
we obtain that 
‘(ab - dN’ — hie » ai 'h:( ab 


/ i ab 


We summarize the results proved in this section in 
THEOREM 6. Let $(X_n, Y_n)$, $n = 1, 2, \ldots, N$ be independent observations on a 
bivariate Poisson vector (X, Y). Then if the test 


\! 
\o 


the ance pendence F i Poisson ariable Ss X and Y: whe re I 78 chose i 


according as on 


ths 


3.5. Asymptotically LMP BRS test. Since the small sample distribution theory 
of the test statistic is not known, in practice we shall use, as suggested earlier, 
the asymptotic theory. If such is the case, we are in fact approximating a LMP 
BRS test. It is of interest to know whether this approximation of the LMP 
BRS test is also an asymptotically LMP BRS test in the sense of Definition 2. 
The LMP BRS test of preassigned level of significance $\alpha$ was given in (4) in 
terms of Z. In terms of the random variable $Z_N$ the test is: 


$$\phi^* = 1,\ \gamma^*,\ 0 \quad\text{according as}\quad Z_N >,\ =,\ < k^*(N, \bar{X}, \bar{Y}) = k^*$$


and whatever be a and b, the constants $\gamma^*$ and $k^*$ are chosen to satisfy 


$$\alpha = \sum_z \phi^*(z)\,p_{Z_N}(z; a, b, 0),$$


where $p_{Z_N}(z; a, b, c)$ is the probability function of $Z_N$ for the system (a, b, c). 
The power of this test is 


$$\beta_N\{\phi^*; a, b, c\} = \sum_z \phi^*(z)\,p_{Z_N}(z; a, b, c).$$
z 


The corresponding asymptotic test is given in (23) and it has the size 


$$\alpha_N = \sum_z \tilde\phi(z)\,p_{Z_N}(z; a, b, 0)$$


and for the alternative (a, b, c) it has the power 


$$\beta_N\{\tilde\phi; a, b, c\} = \sum_z \tilde\phi(z)\,p_{Z_N}(z; a, b, c).$$


DEFINITION 2. The test $\tilde\phi$ is an asymptotically LMP BRS test of size $\alpha$ if 


. iD « ' ‘Oo 
N° — Srl g; —= ~ By(¢*; a, bec 
de { ldc 
when $N \to \infty$, whatever be a and b. 


THEOREM 7. The test $\tilde\phi$ is an asymptotically LMP BRS test of size $\alpha$. 
PROOF. To prove (i) consider 


$$|\alpha_N - \alpha| = \Big|\sum_z \{\tilde\phi(z) - \phi^*(z)\}\,p_{Z_N}(z; a, b, 0)\Big| \le \sum_z I(k, k_N)\,p_{Z_N}(z; a, b, 0),$$


where $I(k, k_N)$ is the indicator of the closed interval with end points k and $k_N$. 
Let the probability distribution function of $Z_N$ and $Z_\infty$ be denoted by $F_N$ and 
$F_\infty$ respectively where we know already that $F_\infty$ is N(0, 1) and hence is 
continuous and differentiable. Therefore, given $\epsilon$ there exists by Theorem 5 an 
$N(\epsilon)$ such that 
$$|F_N - F_\infty| < \epsilon, \qquad N \ge N(\epsilon).$$
This implies that there is an $\eta$ depending on $\epsilon$ which approaches zero as 
$\epsilon$ approaches zero and 


where k’ is a suitable point between / and + » and simil: ‘ly k” is between 
and 7. Finally let « > 0, then follows also that n >, he) 


finiteness of F.(k’) and F.,(k”), (i) is proved 


: 
ii) let us consider 


ce using the 


0, 
log Pp 


ar 


-t ’ ] 
irtzZ immequanhty 


ab + ab os 3 


x because of (1) and the boundedness of (ab 


also satisfied. This completes the proof of the theorem. 


3.6. Applications. This testing problem arose in connection with the work on 
beetles which is being conducted by Professor Park, etc. (see Neyman, Park, 
Scott [8], p. 75) at The University of Chicago. 

Suppose that a square tray, in which a thin layer of flour is spread, is divided 
into subsquares of the same size by drawing hypothetical lines parallel to the 
sides of the square tray. A very large number of beetles were kept in it and 
after sufficient time had elapsed, the number of male and female beetles in each 
subsquare was noted. The problem was to study if in the tray, the distribution 
of male and female beetles was independent of one another. 

If we assume that each pair of beetles distributes itself independently of the 
other pair and uniformly so in the tray, then the stochastic derivation of the 
bivariate Poisson vector (see Loève [5], p. 84) shows that it is applicable to the 
present situation. Several similar situations could be easily conceived where 
the bivariate Poisson vector is applicable. 


4. A class of bivariate random vectors and test of independence. In this sec- 
tion we describe a class of bivariate probability functions such that for any of 
these bivariate probability functions regarded as the joint probability function 
of the random variables X and Y, the LMP BRS test for the independence of 
X and Y is the same as the one obtained in the case of bivariate Poisson vector. 
Also, an example is given for which the LMP BRS test of independence is not 
of that form. 


4.1. A class of bivariate random vectors. In Definition 1 the random variable 
Z is a Poisson variable with expectation d. Let f(t) be a continuous probability 
function and define a random variable W taking nonnegative integral values by 
the probability law 


$$(24)\qquad P\{W = w\} = \int_0^\infty \frac{e^{-\sigma t}(\sigma t)^w}{w!}\,f(t)\,dt, \qquad w = 0, 1, 2, \ldots,$$


where $\sigma$ plays the role of a scale parameter. In particular, if $\sigma = 0$, the random 
variable W is degenerate at zero. This leads to 

DEFINITION 3. The bivariate random vector (X, Y) is a bivariate Poisson* 
vector with respect to the continuous probability function f(t) if (X, Y) = 
(X* + W, Y* + W) where X* and Y* are two independent Poisson variables 
and W is a random variable independent of X* and Y* and having the 
probability function (24). 

Using the above definition, we can easily verify that the random variables 
X and Y are independent if and only if $\sigma = 0$ and in case they are independent 
each is a Poisson variable. If $\sigma > 0$, the random variables X and Y are dependent 
and moreover the marginals X and Y are no longer necessarily distributed as 
Poisson variables. 
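
Definition 3 can be illustrated by simulation. The sketch below assumes that W, given t, is a Poisson variable with mean $\sigma t$, where t is drawn from f(t); the sampler names and the exponential choice of f in the usage comment are illustrative assumptions, not part of the paper.

```python
import math
import random


def poisson_draw(mean, rng):
    """Poisson variate by Knuth's product-of-uniforms method
    (adequate for the small means used in this illustration)."""
    if mean <= 0.0:
        return 0
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def poisson_star_draw(a, b, sigma, draw_t, rng):
    """One observation (X, Y, W) with X = X* + W and Y = Y* + W.
    draw_t(rng) supplies t from the assumed mixing law f(t)."""
    t = draw_t(rng)
    w = poisson_draw(sigma * t, rng)   # W given t: Poisson(sigma * t)
    x_star = poisson_draw(a, rng)      # independent Poisson, mean a
    y_star = poisson_draw(b, rng)      # independent Poisson, mean b
    return x_star + w, y_star + w, w


# Hypothetical usage with f taken to be the unit exponential density:
# rng = random.Random(1)
# x, y, w = poisson_star_draw(2.0, 3.0, 0.5, lambda r: r.expovariate(1.0), rng)
```

With sigma equal to zero the common component W vanishes and the pair reduces to two independent Poisson variables, in agreement with the remark following the definition.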


4.2. Test of independence. Under the hypothesis of independence $\sigma = 0$ and 
the random variables X and Y are independent Poisson variables with respective 
expectations a and b. Furthermore, if $(X_n, Y_n)$, $n = 1, 2, \ldots, N$ are independent 
observations on (X, Y), then (S, T) is a vector boundedly complete sufficient 
statistic for (a, b) as shown in Theorem 2(i). Therefore, all similar tests for this 
problem have Neyman structure. The conditional LMP BRS test for 
independence is given by 


\ ] 
6 


according as 
We, 
5 — \ ab ) 


o, (X, — w)! wi(Y, — w)! 
~@ 


| e*'(at)"f(t) dt 


/0 o=( 


S k( s; t, N), 


{ 
{ 


where S = s, T = t and $W_n = \min(X_n, Y_n)$. If we assume that f(t) is such 
that 


i. .-teh 5 r 9 a " 
(26) | ‘ae $8) =| : f ‘= Lee) at 
o=( Ooo 


Oo w! alll _ 


for c= Q, n= Q, ie 2 


| #@ dt< « 


then the condition (25) can be reduced to the equivalent form 

$$Z = \sum_{n=1}^{N} X_nY_n \gtreqless k = k_N(s, t).$$

THEOREM 9. Let $(X_n, Y_n)$, $n = 1, 2, \ldots, N$ be independent observations on the 
bivariate Poisson* vector (X, Y) with respect to the probability function f(t) 
satisfying the conditions (26) and (27), then the LMP BRS test for the independence 
of X and Y is given by 


$$\phi = 1,\ \gamma_0,\ 0 \quad\text{according as}\quad Z >,\ =,\ < k_0 = k(N, s, t).$$

Therefore, the theory developed in Section 3.3 is applicable in these situations 
also. 


4.3. Test of independence which is not a function of $\sum_{n=1}^{N} X_nY_n$. 
In Definition 3 instead of the probability function f(t) let us assume that t = 0 
and $1/\sigma$ with respective probabilities 1 − p and p. Then the probability function 
of (X, Y) is given by 


$$P\{X = x, Y = y \mid a, b, p\} = (1 - p)e^{-(a+b)}\frac{a^x b^y}{x!\,y!} + p\,e^{-(a+b+1)}\sum_{w=0}^{\min(x, y)}\frac{a^x b^y (ab)^{-w}}{(x - w)!\,w!\,(y - w)!}.$$

Also X and Y are independent if and only if p = 0. 
In this case the LMP BRS test for p = 0 against p > 0 is given by the rule: 
\' 


N Wn Y , s 
). according as ie 2 (xX —— ~ > kK(N, 8, t). 


, 4 


— w)! w! 
0 


Clearly this test of independence is not a function of $\sum_{n=1}^{N} X_nY_n$. 


CONFIDENCE INTERVALS FROM CENSORED SAMPLES 


By Max Halperin¹ 


Knolls Atomic Power Laboratory 


1. Summary. Suppose a random sample of size n is drawn from a normal 
population with mean $\mu$ and standard deviation $\sigma$ and that the sample has been 
censored either to the right or the left. Suppose the censoring is at a fixed point 
of the distribution or at a pre-specified sample percentage point, or is a 
combination of these two types of censoring. In this paper we present small sample 
bounded confidence intervals for $\mu$ and $\sigma$, based on a joint bounded confidence 
region at a confidence level with fixed bound. The limits for $\mu$ and $\sigma$ so obtained 
converge in probability, as $n \to \infty$, to the parameter values. The procedure of 
the paper allows similar results for some other scale-translation families of 
distributions. One such case, which is briefly discussed, is that of the exponential 
distribution with unknown initial point. The somewhat general applicability of 
the procedure mitigates the fact that it is not based on sufficient statistics. 


2. Derivation of results: normal distribution. 

a. Fixed point censoring. In this discussion we assume censoring is to the right; 
the changes if censoring is to the left will be obvious. 

Thus, we consider a random sample of n on $N(\mu, \sigma)$ censored to the right at 
a known number T. For m the number of noncensored observations greater than 
zero, denote the ordered non-censored observations by 


$$x_1 \le x_2 \le \cdots \le x_m \le T.$$


The number m is a random variable having a binomial distribution with 
parameters $\Phi[(T - \mu)/\sigma]$ and n, where $\Phi$ is the unit-normal cumulative 
distribution function. The density of $x_1, x_2, \ldots, x_m$ given m is 


$$(2.1)\qquad \frac{m!\,\prod_{i=1}^{m}\sigma^{-1}\phi[(x_i - \mu)/\sigma]}{\{\Phi[(T - \mu)/\sigma]\}^{m}}, \quad x_1 \le x_2 \le \cdots \le x_m \le T; \qquad 0, \text{ elsewhere},$$


where $\phi$ is the unit-normal density. 

On the basis of (2.1) and the distribution of m, we would like to obtain a 
reasonable confidence set on $(\mu, \sigma)$ at a bounded confidence level $\beta$, where $\beta$ is 
specified in advance. By "reasonable" we mean that the set should be bounded 
(providing this is possible), and that as $n \to \infty$ any confidence limit should 
converge to the appropriate parameter. We consider in detail the case where 
two sided limits on $\mu$ and $\sigma$ are desired. 

¹ Received November 16, 1959; revised January 5, 1961. 

First we observe that, from the binomial distribution of m, we can get 
confidence limits on $\Phi[(T - \mu)/\sigma]$, or $\Phi_T$ for short, at say the $\beta_1$ confidence level, as 
$[\Phi_L \le \Phi_T \le \Phi_U]$. Now suppose that from (2.1) we could make a confidence 
statement, conditional on m, concerning $\Phi_T$ and some other function of $\mu$ and $\sigma$, 
or at least the latter, at say the $\beta_2$ level or greater for each m. Denote this 
conditional confidence statement by $C_m$. The total probability that both $C_m$ and the 
statement about $\Phi_T$ are true is clearly 


$$(2.2)\qquad \sum \Pr\{C_m \text{ is true} \mid m\}\,\Pr\{m\},$$


where the sum runs over all values of m such that the associated statement about 
$\Phi_T$ is true. It then follows that the probability (2.2) is $\ge \beta_1\beta_2$. Since $\mu$ and $\sigma$ are 
functions of $\Phi_T$ and whatever other parameter is involved we should then be 
able to get confidence limits on $\mu$ and $\sigma$, as well as other functions, at least at the 
$\beta_1\beta_2$ level; this presumes $\Phi_T$ and the "other parameter" are a one-to-one 
transformation of $\mu$, $\sigma$. The question of optimum choice of $\beta_1$ and $\beta_2$ such that $\beta_1\beta_2 = \beta$ 
is not discussed; for simplicity we take $\beta_1 = \beta_2 = \beta'$. 
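
As an illustration of the binomial step, the sketch below computes equal-tailed limits on $\Phi_T$ from m uncensored observations out of n by solving the two binomial tail equations with bisection. The equal-tailed (Clopper-Pearson) form and the function names are assumptions made for illustration; the paper does not say which binomial limits or tables were used.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """Pr{K >= k} for K binomial with parameters (n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k, n + 1))


def _solve_increasing(g, target):
    """Bisection for g(p) = target when g is increasing on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def binomial_limits(m, n, level):
    """Equal-tailed confidence limits on the success probability
    after observing m successes in n trials."""
    tail = 0.5 * (1.0 - level)
    lower = 0.0 if m == 0 else _solve_increasing(
        lambda p: binom_upper_tail(m, n, p), tail)
    upper = 1.0 if m == n else _solve_increasing(
        lambda p: binom_upper_tail(m + 1, n, p), 1.0 - tail)
    return lower, upper


# Joint level when both statements are made at level beta' = 0.95:
joint_level = 0.95 * 0.95   # at least 0.9025
```

Taking both component levels equal to $\beta'$ gives a joint level of at least $\beta'^2$, so two 95% statements combine to a statement at no less than about the 90% level.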


Turn to the definition of $C_m$. Suppose we let 


$$(2.3)\qquad z_i = \Phi[(x_i - \mu)/\sigma]/\Phi_T.$$
Then the density of the $z_i$'s given m is 


$$(2.4)\qquad m!, \quad 0 \le z_1 \le z_2 \le \cdots \le z_m \le 1; \qquad 0, \text{ elsewhere},$$


providing m > 0. If m = 0, we can obviously only make a useful statement 
about $\Phi_T$; any statement made about any other parameter must include all 
possibilities in order to be correct and hence will be trivial. Thus if m = 0, the 
confidence region cannot be bounded. However, m = 0 is clearly a degenerate 
case for either point or interval estimation of $\mu$ and $\sigma$ no matter how one goes 
about solving these problems. Hence, we take care of this case in a purely formal 
way to insure the bounded confidence level property. This will be made explicit 
shortly. 

Now, suppose $m \ge 1$ and that we can find numbers r and s (integral or zero) 
and a number $\delta$ such that 


$$(2.5)\qquad 0 \le s < r \le m + 1, \qquad \Pr\{z_s \le \delta \le z_r\} \ge \beta',$$


where $z_0 = 0$ and $z_{m+1} = 1$. Of course, s, r and $\delta$ will depend on m. As will be 
apparent shortly, we must require $\delta \le \tfrac{1}{2}$. Before indicating in detail how to 
determine r, s and $\delta$, we turn to the inequalities on $\mu$ and $\sigma$ separately which are 
implied by (2.5) and a $\beta'$ confidence interval on $\Phi_T$. From (2.3), the inequality 
on $\delta$ specified by (2.5) can be written (for s > 0, r < m + 1) as 


$$(2.6)\qquad (x_s - \mu)/\sigma \le \Phi^{-1}(\delta\Phi_T) \le (x_r - \mu)/\sigma,$$


where $\Phi^{-1}(\alpha)$ is the standardized normal deviate exceeded with probability 
$1 - \alpha$. Since we can write, for example, 


$$(2.7)\qquad (x_s - \mu)/\sigma = [(x_s - T)/\sigma] + [(T - \mu)/\sigma] = [(x_s - T)/\sigma] + \Phi^{-1}(\Phi_T),$$
and $\Phi^{-1}(\delta\Phi_T) - \Phi^{-1}(\Phi_T) < 0$, for $0 < \delta < 1$, it follows from (2.6) that 


$$(2.8)\qquad \sigma \le (T - x_s)/[\Phi^{-1}(\Phi_T) - \Phi^{-1}(\delta\Phi_T)].$$

A lower inequality for $\sigma$ based on $x_r$ and of the same form follows similarly. 

Differentiation of the right hand side of (2.8) with respect to $\Phi_T$ shows that it 
will be monotone decreasing in $\Phi_T$ providing one has 

$$(2.9)\qquad \exp\{-\tfrac{1}{2}[\Phi^{-1}(\delta\Phi_T)]^2\} - \delta\exp\{-\tfrac{1}{2}[\Phi^{-1}(\Phi_T)]^2\} > 0.$$
The left hand side of (2.9) is zero at $\Phi_T = 0$ and differentiation with respect to 
$\Phi_T$ shows that it is monotone increasing. Thus one has as limits for $\sigma$, if s > 0, 
r < m + 1, since $T - x_s > T - x_r$, 


$$(2.10)\qquad \frac{T - x_r}{\Phi^{-1}(\Phi_U) - \Phi^{-1}(\delta\Phi_U)} \le \sigma \le \frac{T - x_s}{\Phi^{-1}(\Phi_L) - \Phi^{-1}(\delta\Phi_L)}.$$
Clearly if r = m + 1, s > 0, the lower bound in (2.10) becomes zero, while if 
r < m + 1, s = 0, the upper bound is infinite. 


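Now we turn to the determination of limits for $\mu$. First we observe that 
$\mu = T - \sigma\Phi^{-1}(\Phi_T)$ so that 

$$\frac{\partial\mu}{\partial\sigma} = -\Phi^{-1}(\Phi_T) \le 0, \text{ if } \Phi_T \ge \tfrac{1}{2}; \qquad > 0, \text{ if } \Phi_T < \tfrac{1}{2},$$


$$\frac{\partial\mu}{\partial\Phi_T} = -\sigma(2\pi)^{1/2}\exp\{\tfrac{1}{2}[\Phi^{-1}(\Phi_T)]^2\} < 0.$$


Thus we need to consider three cases, 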


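(A) $\Phi_L \ge \tfrac{1}{2}$, (B) $\Phi_U \le \tfrac{1}{2}$, (C) $\Phi_L < \tfrac{1}{2} < \Phi_U$. 

In case (A), $(\partial\mu/\partial\sigma) \le 0$, $(\partial\mu/\partial\Phi_T) < 0$. Consequently the upper limit to $\mu$ 
must lie on $\sigma = (T - x_r)/[\Phi^{-1}(\Phi_T) - \Phi^{-1}(\delta\Phi_T)]$ so that 

$$(2.11)\qquad \mu \le T - (T - x_r)\Phi^{-1}(\Phi_T)/[\Phi^{-1}(\Phi_T) - \Phi^{-1}(\delta\Phi_T)].$$

Differentiating the right hand side of (2.11) with respect to $\Phi_T$ it is found that 
the derivative is negative providing 

$$(2.11a)\qquad \delta\Phi^{-1}(\Phi_T)\exp\{-\tfrac{1}{2}[\Phi^{-1}(\Phi_T)]^2\} - \Phi^{-1}(\delta\Phi_T)\exp\{-\tfrac{1}{2}[\Phi^{-1}(\delta\Phi_T)]^2\} > 0.$$

If $\Phi_T \ge \tfrac{1}{2}$ and $\delta \le \tfrac{1}{2}$ (which has been assumed) (2.11a) is clearly satisfied. If 
$\Phi_T < \tfrac{1}{2}$, the derivative of the left hand side of (2.11a) with respect to $\Phi_T$ is 
found to be positive. Since the left hand side of (2.11a) is zero at $\Phi_T = 0$, (2.11a) 
is always satisfied and it follows that 

$$(2.11b)\qquad \mu \le T - (T - x_r)\Phi^{-1}(\Phi_L)/[\Phi^{-1}(\Phi_L) - \Phi^{-1}(\delta\Phi_L)].$$

Similarly, the lower limit for $\mu$ must lie on $\sigma = (T - x_s)/[\Phi^{-1}(\Phi_T) - \Phi^{-1}(\delta\Phi_T)]$ 
and we conclude that 

$$(2.11c)\qquad \mu \ge T - (T - x_s)\Phi^{-1}(\Phi_U)/[\Phi^{-1}(\Phi_U) - \Phi^{-1}(\delta\Phi_U)].$$

Note that if r = m + 1, s > 0, (2.11b) is replaced by $\mu \le T$, while if r < m + 1, 
s = 0, (2.11c) is replaced by $\mu \ge -\infty$. 

In case (B) $(\partial\mu/\partial\sigma) > 0$, $(\partial\mu/\partial\Phi_T) \le 0$. Arguments analogous to those above 
lead to the interval for $\mu$ 

$$T - \frac{(T - x_r)\Phi^{-1}(\Phi_U)}{\Phi^{-1}(\Phi_U) - \Phi^{-1}(\delta\Phi_U)} \le \mu \le T - \frac{(T - x_s)\Phi^{-1}(\Phi_L)}{\Phi^{-1}(\Phi_L) - \Phi^{-1}(\delta\Phi_L)},$$

for r < m + 1, s > 0. If r = m + 1, s > 0, the lower bound for $\mu$ becomes T; 
if r < m + 1, s = 0, the upper limit becomes $+\infty$. 

In case (C) we break the interval $(\Phi_L, \Phi_U)$ into $(\Phi_L, \tfrac{1}{2})$ and $(\tfrac{1}{2}, \Phi_U)$. In the 
interval $(\Phi_L, \tfrac{1}{2})$ one has from case (B), if r < m + 1, s > 0, 

$$(2.13a)\qquad T \le \mu \le T - (T - x_s)\Phi^{-1}(\Phi_L)/[\Phi^{-1}(\Phi_L) - \Phi^{-1}(\delta\Phi_L)].$$

In the interval $(\tfrac{1}{2}, \Phi_U)$ we have by case (A), 

$$(2.13b)\qquad T - (T - x_s)\Phi^{-1}(\Phi_U)/[\Phi^{-1}(\Phi_U) - \Phi^{-1}(\delta\Phi_U)] \le \mu \le T,$$

and we combine (2.13a) and (2.13b) in the obvious way. If r = m + 1, s > 0, 
(2.13a) and (2.13b) remain unchanged. If r < m + 1, s = 0, the upper limit to 
(2.13a) becomes $+\infty$ and the lower limit to (2.13b) becomes $-\infty$ leading to a 
trivial interval on $\mu$. Note that in all cases considered the confidence region is 
closed providing we never take $z_0$ as a lower limit in (2.5); i.e., providing we 
exclude the choice s = 0. 

The above discussion can be modified in an obvious way to yield one sided 
intervals on $\mu$ or $\sigma$ but we omit that analysis. 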

Now we need to discuss the choice of r, s and $\delta$ to achieve the desired $\beta'$ 
confidence. First we note that the confidence levels achievable will be somewhat 
limited by the confidence levels which can be obtained for $\Phi_T$. Aside from this 
we note that the smallest value of m, given $\delta$, for which we can take s > 0, 
r < m + 1 and still achieve conditional protection of at least $\beta'$ is determined by 

$$(2.14)\qquad 1 - (1 - \delta)^m - \delta^m \ge \beta'.$$

We observe the left hand side of (2.14) has a maximum, for given m, at $\delta = \tfrac{1}{2}$. 
Hence taking $\delta = \tfrac{1}{2}$ will allow us to get two sided limits on $\mu$ and $\sigma$ with 
r < m + 1, s > 0 for a lower value of m than will be permitted for any other value of 
$\delta$. For values of m so low that (2.14) cannot be satisfied even for $\delta = \tfrac{1}{2}$, our 
earlier discussion suggests taking s = 1, r = m + 1 and then choosing $\delta$ so that 

$$(2.15)\qquad 1 - (1 - \delta)^m \ge \beta'.$$

Unfortunately for values of $\beta$ used in practice (2.15) may violate our 
assumption $\delta \le \tfrac{1}{2}$ for m small enough. Thus, for very small m we would take s = 0, 
r = m and choose $\delta$ to insure 

$$(2.16)\qquad 1 - \delta^m \ge \beta',$$

which can always be done. Thus for some small values of m one will get 
unbounded intervals in the effort to insure bounded confidence. For values of m 
for which it is feasible to take $\delta = \tfrac{1}{2}$ it is natural to take r and s as the 
(symmetric) extremes from the maximum and minimum order statistics, allowing 
(2.5) to be satisfied. Since for small samples this may be somewhat conservative, 
an alternative is to take $\delta$ sufficiently less than $\tfrac{1}{2}$ for the given r and s, so that a 
conditional coverage of exactly $\beta'$ is obtained. Other alternatives may occur to 
the reader but the lack of exactness is a trivial question in any case. The case 
m = 0, as indicated earlier, is degenerate and can be included in the discussion 
above by associating with m = 0 the statement $0 < \sigma < \infty$ as well as the 
appropriate interval on $\Phi_T$ (which implies $-\infty < \mu < \infty$). 
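
The choice of r and s can be checked numerically. Since the $z_i$ are distributed as the order statistics of m uniform variables, the event $z_s \le \delta \le z_r$ occurs exactly when the number of $z_i$ not exceeding $\delta$ lies between s and r − 1, so the conditional coverage is a binomial sum. The sketch below is illustrative; the function names are hypothetical.

```python
from math import comb


def coverage(m, s, r, delta):
    """Pr{z_s <= delta <= z_r} for the order statistics of m uniform
    variables, with the conventions z_0 = 0 and z_{m+1} = 1."""
    return sum(comb(m, i) * delta**i * (1.0 - delta)**(m - i)
               for i in range(s, r))


def shortest_symmetric(m, level, delta=0.5):
    """Largest s (with r = m + 1 - s) whose coverage is still at least
    `level`; returns None if even s = 1 is insufficient."""
    best = None
    for s in range(1, (m + 1) // 2 + 1):
        r = m + 1 - s
        if r > s and coverage(m, s, r, delta) >= level:
            best = (s, r)
    return best


# For m = 22 and the median (delta = 1/2):
# shortest_symmetric(22, 0.95) gives (6, 17), whose coverage is about 0.9831.
```

With s = 1 and r = m the sum reduces to $1 - (1 - \delta)^m - \delta^m$, the left hand side of (2.14).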

Now we turn to the question of the asymptotic behavior of the intervals we 
have defined. To this end we note that (2.5) can be written as 


$$(2.17)\qquad \sum_{i=s}^{r-1}\binom{m}{i}\delta^i(1 - \delta)^{m-i} \ge \beta'.$$


It is clear that for large m (2.17) can be written approximately as 
2.18 (27) exp (—4}u°) du 


where 
6m + B®, m’s’(1l — 6 


s 6m + &, m’d'(1 — 6 


and &; and #; are both finite. For m large our discussion has taken 6 , and 
®; —?, -p “[4(1 — 6’)], although we could also take any 6,0 < 6 < 3 


and any finite pair ®; , 2 satisfying (2.18). In any cases it is easy to show that 
asn— x, withr and s defined by (2.18 


are each N(0, 1). This, of course, requires taking into account the asymptotic 
distribution of m, which, for simplicity, is not displayed explicitly in (2.19). 


It follows that for large n and any $\epsilon > 0$ 


1 — 6) ‘ l 
es >er? SO ). 
n ne 
i l 
> €? 0 : 
Tle 


so that $z_r$ and $z_s$ both converge in probability to $\delta$. This fact, in conjunction with 
the convergence of $\Phi_L$ and $\Phi_U$ to $\Phi_T$ allows us to conclude that the confidence 
intervals on $\mu$ and $\sigma$ separately will indeed converge as $n \to \infty$ for any $\delta \le \tfrac{1}{2}$ 
and for r and s chosen in any of the ways suggested above. 

b. Fixed sample percent point censoring. We again assume censoring is to the 
right. This case can be treated in a manner identical to that of case (a) by proper 
identification. Thus, here we suppose that the n − m − 1 largest observations 
are censored. Denoting the first m + 1 order statistics by $x_1, x_2, \ldots, x_{m+1}$, we 
identify $\Phi[(x_{m+1} - \mu)/\sigma]$ with $\Phi[(T - \mu)/\sigma]$ and consider the conditional 
distribution of $x_1, x_2, \ldots, x_m$ given $x_{m+1}$. The same reasoning as used in case (a) 
leads to formally identical results with $x_{m+1}$ replacing T. It is clear that the 
confidence intervals obtained for this case will converge to points for large samples 
for $\delta \le \tfrac{1}{2}$ and r and s chosen as indicated in case (a). The only real difference 
from case (a) is the random nature of $x_{m+1}$ rather than m and this causes no 
difficulty because of the stochastic convergence of $x_{m+1}$ to the appropriate percent point 
of the normal distribution. 

It should be remarked that for this case one can derive many other confidence 
regions for $\mu$ and $\sigma$ without resorting to the conditional procedure outlined above. 
For example, if we obtained two sided confidence limits on two percentiles, this 
would generally give a closed region on $\mu$ and $\sigma$. The approach of this paper does 
have the virtue of providing a uniform treatment of both types of censoring. A 
more meaningful criterion to use in choosing between our approach and the un- 
conditional approach would be the shortness of the confidence intervals obtained. 
However, we do not pursue this question. 

c. Mixed censoring. As a practical matter many life-testing experiments are 
a mixture of cases (a) and (b) above. Thus one frequently finds that a life test 
is terminated if either r out of n items have failed or if some fixed length of time, 
T, has elapsed. In such cases r will generally be a large fraction of n, reflecting 
the thought that the further results give only a modest amount of information; 
the choice of T, on the other hand, may reflect the extreme life expected or of 
concern, or perhaps the urgency of the need for information. Procedures 
analogous to those of case (a) or (b) may also be developed for this situation and are 
sufficiently different to warrant separate consideration. 
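
The stopping rule just described can be made concrete with a short sketch. It assumes complete lifetimes are available (as in a simulation) and applies the rule "stop at the r-th failure or at time T, whichever comes first"; the function name and the case labels are hypothetical.

```python
def mixed_censoring(lifetimes, r, T):
    """Apply the rule 'stop at the r-th failure or at time T'.

    Returns (m, observed, case): m is the number of uncensored
    observations, observed the ordered uncensored values, and case is
    'a' when the test ran to time T with fewer than r failures and
    'b' when it stopped at the r-th failure x_r <= T."""
    failures = sorted(x for x in lifetimes if x <= T)
    if len(failures) >= r:
        return r, failures[:r], "b"
    return len(failures), failures, "a"


# Hypothetical usage:
# mixed_censoring([5, 1, 3, 9, 7], r=3, T=6)  -> three failures, case 'b'
# mixed_censoring([5, 1, 3, 9, 7], r=3, T=4)  -> two failures, case 'a'
```

The returned label indicates which of the two forms of the sample likelihood applies to the data in hand.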

In this case the sample likelihood has two forms depending on whether m, the 
observed number of uncensored observations, is equal to r or less than r. If m < r, 
we have exactly the situation described in case (a) and can follow the procedures 
described for that case. If m = r, we are in a situation like case (b) but not 
identical. The distinguishing feature is that we must have $x_r \le T$, i.e., the 
distribution of m is truncated at m = r. The procedure of case (b) still applies to the 
boundaries of the confidence region defined by the conditional distribution of 
$x_1, x_2, \ldots, x_{r-1}$ given $x_r$. But because of the truncated distribution of m, at 
m = r one only has the trivial upper confidence limit of unity for $\Phi_T$. However, 
one can obtain on the basis of usual binomial theory a non-trivial lower 
confidence limit for $\Phi_T$ at the $\beta$ level, say $\Phi_L$. It is also appropriate to assert that 
P(r, — uw) /o| S &, because of the manner in which the statement on 7 is ob 
tained. From ®; = ®, we have 
we are led to conclude 
> O4 pl ®P, 


where $\Phi_r = \Phi[(x_r - \mu)/\sigma]$. 

The related conditional statement at level $\beta'$, for appropriately chosen $\delta$, 
s and t ($r \ge t > s \ge 0$) is just 
[> @ 


o 


— x,)/[@ '(&,) — &* (a, 


T 


Of course, if (r = t, s > 0) or (r > t, s = 0), (2.22) must be modified as 
indicated earlier. We observe that the left hand side of (2.21) is monotone increasing 
in $\sigma$ and approaches $\Phi_T$ as $\sigma \to \infty$. Coupling this with previous discussion it is 
clear that at least for (r > t, s > 0) (2.21) and (2.22) provide a closed region in 
the $(\sigma, \Phi_r)$ plane. 

It is clear, from what has been said, that for r > ¢t, s > 0 an upper limit to a 
is given from (2.21) by 


and from 
2.23b ; T (6 '(o,) —® 


Since in (2.23a) $d\sigma/d\Phi_r > 0$ while in (2.23b) $d\sigma/d\Phi_r < 0$ it follows that (2.23a) 
and (2.23b) have a single point of intersection and that the value of $\sigma$ at this point, 
$\sigma_0$ say, is in fact the upper limit to $\sigma$ implied by (2.21) and (2.22). The value of 
$\sigma_0$ is probably most easily found by trial and error using the monotonicity 
properties just described. A lower limit for ¢ follows from previous discussion by 
substituting ©, for ®, in the left hand side of (2.22). Again the results must be 
slightly modified if (r = t, s > 0) or (r > t, s = 0). The details are omitted. 

To obtain limits for $\mu$ we must go through an argument similar to that of case 
(a). There are three cases to consider. We consider only the situation for which 
(r > t, s > 0). The necessary changes if (r = t, s > 0) or (r > t, s = 0) can 
easily be ascertained as indicated earlier. The three cases are, as in (a), 


A) ®,, , the lower limit to & 


To determine ®,, one must solve (2.23a) with 
a (Tr x,)/|® '(4, 


This again is most easily done by trial and error. The analysis parallels that of 
case (a) and one is led to, for case (A), using $D(\delta, p) = \Phi^{-1}(p) - \Phi^{-1}(\delta p)$, 
(2.24a) 2x (T — x,)® '(%,)/D(6,®.) <p 
l ° 

\ (@,, ) Di 6, ?,, : 
and for case (B 
and for case (C 
(2.24¢ tf ST 

Pp (d,7 ) dD 6, ®,, 


To meaningfully discuss asymptotic behavior of the confidence intervals for 
this case, it is clear that one must require r/n to approach a constant, say p, as 
$n \to \infty$. It is obvious that if $p > \Phi_T$, we will in the limit be essentially in case (a) 
so that for suitably restricted $\delta$, t and s the argument of case (a) applies with 
almost no change. It is only if $p \le \Phi_T$ that a new asymptotic argument is needed. 
It is clear that such an argument will follow the lines indicated for case (b) and 
it is omitted. 


3. Extension of results. It is clear that the results of Section 2 do not depend 
crucially on the assumption of normality except in the implications of the regions 
on $\sigma$ and $\Phi_T$ (or $\Phi_r$) for limits on $\mu$ and $\sigma$. Thus the procedure should be applicable 
to other distributions depending only on scale and location parameters. We 
illustrate this by sketching the extension of the results of Section 2(a) to the 
case of sampling from 

$$(3.1)\qquad p(x) = (1/\sigma)\exp[-(x - \mu)/\sigma], \qquad \mu \le x < \infty,\ \sigma > 0.$$


The entire argument of Section 2 up to and including (2.8) then is applicable to 
(3.1) by the trivial re-definition of $\Phi$ as the cumulative distribution function of 
$(x - \mu)/\sigma$, where x obeys (3.1), and $\Phi^{-1}(\alpha)$ as the value of $(x - \mu)/\sigma$ exceeded 
with probability $1 - \alpha$. The same type of argument as that following (2.8) leads 
to limits for $\sigma$ which are identical with those given by (2.10) upon proper 
definition of $\Phi^{-1}$ and $\Phi$. The discussion concerning limits for $\mu$ follows identical 
lines being rendered even simpler and not requiring $\delta \le \tfrac{1}{2}$ because of the 
nonnegative character of $\Phi^{-1}(\alpha)$; the results for $\mu$ are in fact identical with those given 
in Section 2, upon proper identification and, of course, taking into account that $\mu$ 
is essentially non-negative (although, formally, this is not required). Obviously 
there is no difficulty in the consistency argument for properly restricted r and s. 

It is evident that the results may hold for many classes of distributions under 
appropriate conditions on the parent densities. However, we do not pursue this 
question further. 


4. An example. As an example, we consider some data falling in case (a); the 
maximum likelihood estimates of $\mu$ and $\sigma$ are included as a matter of interest. 
Thus consider a random sample of 30, drawn from a normal distribution with 
zero mean and unit variance, with T = 1. The data, obtained from a table of 
random normal deviates, are summarized below in order of size. 

805 11 0.482 ; 0.658 
.787 12 139 22 0.906 
.501 3 105 23 >1 
399 14 - 005 

376 15 O41 

339 060 

186 17 159 

32 18 199 

010 19 0.279 
.690 20 0.464 


A 95% confidence interval on $\Phi_T$ based on 22 out of 30 sample units uncensored is 
given by $.53 \le \Phi_T \le .88$. For a sample of 22 one finds that the shortest 
symmetric confidence interval for the median at least at the 95% level is given by 
$x_6$ and $x_{17}$. The exact confidence level is in fact .9831. One easily finds using 
binomial tables that $\delta = .42$ makes $x_6$ and $x_{17}$ almost exactly 95% confidence 
limits on the 42% point of the distribution. The formulae derived above 
immediately give 


$$-.83 \le \mu \le .916,$$


-— 27 
S 2.19, 


0 


as confidence limits each at least at the 90% level. Computation of the maximum 
likelihood estimates of $\mu$ and $\sigma$ for these data gives estimates, 


$$\hat\mu = .12, \qquad \hat\sigma = 1.28.$$


Note that $\hat\mu$ is near zero and well within our confidence limits on $\mu$ while $\hat\sigma$ is near 
the lower limit on $\sigma$. This, of course, is a matter over which we have no control, 
since the confidence interval estimates and point estimates are based on different 
principles. We would, however, generally expect our point estimates to lie within 
the confidence intervals, especially with increase in sample size. 


5. Some final remarks. Some consideration was given to the problem of 
obtaining confidence intervals for $\mu$ and $\sigma$ when one has double censoring. A procedure 
along the lines of Section 2 was studied but it appeared that the confidence 
region based on order statistics of the conditional truncated distribution of 
uncensored items and the frequency split of the data into the three possible 
categories might not define a useful region on $\mu$ and $\sigma$. In particular, there was some 
indication that the confidence statement on $\mu$ derivable from the region might, in 
some instances, involve disjoint intervals which would be unacceptable. 
Attention was also given to the use of order statistics for obtaining confidence limits on 
$\mu$ and $\sigma$ in the singly truncated (to the right) normal distribution. Using four 
order statistics and two percent points to obtain a region along the lines of 
Section 2, it turned out that although $\sigma$ could be bounded as a function of $\Phi_T$ 
and various order statistics, one could not bound $\Phi_T$ other than trivially. It is 
possible but not, in the writer's opinion, likely, that schemes more complex than 
that of Section 2 for using the order statistics might allow a solution. 

Although the point has not been emphasized, it is clear that the results of this 
paper are conservative in two senses. On the one hand we are using a bounded 
confidence region. On the other hand we are using selected order statistics of the 
uncensored portion of our sample rather than all the uncensored data. It is felt 
that the conservatism is considerably mitigated by the fairly general nature of 
the results and the bounded confidence property. 


Some readers of an earlier version of this paper have suggested that confidence 
intervals based on asymptotic parametric theory can hardly be so bad that the 
procedure of this paper should be preferred. This may very well be true but un- 
fortunately appears almost impossible to verify analytically and has not yet been 
investigated numerically. Thus, a choice between the procedures of this paper
and asymptotic parametric theory is presently a choice between somewhat long
intervals at known confidence levels and relatively short intervals with
conjectural confidence properties.

Various papers treating the asymptotic parametric theory are given in the
bibliography compiled by Mendenhall [1].


TESTS OF FIT BASED ON THE NUMBER OF OBSERVATIONS FALLING
IN THE SHORTEST SAMPLE SPACINGS DETERMINED BY EARLIER
OBSERVATIONS


By LIONEL WEISS
Cornell University


1. Introduction. Suppose the random variables $X_1, X_2, \ldots$ are known to
be independent and identically distributed, with a continuous cumulative
distribution function which is otherwise unknown. The problem which this paper
discusses is the familiar one of testing the hypothesis that the cumulative
distribution function is equal to a given completely specified cumulative distribution
function $G(x)$. By using the random variables $G(X_1), G(X_2), \ldots$ in place of
$X_1, X_2, \ldots$, the problem becomes that of testing the hypothesis that the
common cumulative distribution function of $G(X_1), G(X_2), \ldots$ is the uniform
distribution function $U(x)$, where $U(x) = x$ for $0 \le x \le 1$. For the remainder
of the paper, it will be assumed that the problem has been reduced to this form,
so that there is no loss of generality in assuming that $G(x) = U(x)$, and that all
distributions considered assign probability one to the closed interval $[0, 1]$.

Let $Y_1(n), Y_2(n), \ldots, Y_n(n)$ denote the ordered values of $X_1, \ldots, X_n$, where
$0 \le Y_1(n) \le Y_2(n) \le \cdots \le Y_n(n) \le 1$. For convenience, $Y_0(n)$ is defined as 0
and $Y_{n+1}(n)$ is defined as 1. $T_i(n)$ denotes the closed interval $[Y_{i-1}(n), Y_i(n)]$,
and $T_i'(n)$ denotes the length of this interval, for $i = 1, \ldots, n + 1$. The $T_i'(n)$
are known as sample spacings.


Let $p$ be a fixed quantity in the open interval $(0, 1)$. The set $S_n(p)$ is defined
as the union of the shortest sample spacing, the next shortest sample spacing,
$\ldots$, until the total length of the sample spacings included in $S_n(p)$ is exactly
equal to $p$. With probability one, this will require the use of a portion of the last
sample spacing used, which for convenience will always be taken as the left-hand
portion of the sample spacing broken up. The chance event $C_n(p)$ is defined as
that event which occurs when and only when the random variable $X_{n+1}$ falls in
the set $S_n(p)$.
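The construction of $S_n(p)$ and of the event $C_n(p)$ can be sketched in code. The following is a minimal illustration only, not part of the paper: the function names `shortest_spacings_set` and `event_occurs` are hypothetical, and the observations are assumed to lie in $[0, 1]$ with no ties.

```python
def shortest_spacings_set(xs, p):
    """Return S_n(p) as a list of (left, right) intervals.

    xs : the first n observations, assumed to lie in [0, 1]
    p  : the fixed quantity in (0, 1)
    """
    ys = [0.0] + sorted(xs) + [1.0]          # Y_0(n) = 0, Y_{n+1}(n) = 1
    # Each spacing is stored as (length, left endpoint) and sorted by length.
    spacings = sorted((ys[i + 1] - ys[i], ys[i]) for i in range(len(ys) - 1))
    pieces, remaining = [], p
    for length, left in spacings:
        if remaining <= 0:
            break
        take = min(length, remaining)        # last spacing: left-hand portion only
        pieces.append((left, left + take))
        remaining -= take
    return pieces


def event_occurs(xs, x_next, p):
    """Indicator of C_n(p): does the next observation fall in S_n(p)?"""
    return any(a <= x_next <= b for a, b in shortest_spacings_set(xs, p))
```

For example, with `xs = [0.1, 0.4, 0.5]` and `p = 0.3` the two spacings of length 0.1 are used in full and the left-hand 0.1 of the spacing from 0.1 to 0.4 completes the set, so `event_occurs(xs, 0.15, 0.3)` is true while `event_occurs(xs, 0.3, 0.3)` is false.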

If the hypothesis of a uniform distribution is true, the chance events $C_1(p)$,
$C_2(p), \ldots$ are independent events, each with probability exactly equal to $p$. If
the hypothesis is not true, the chance events are not independent and their
probabilities are not all the same. However, the definition of the set $S_n(p)$ clearly
favors the inclusion of those sections of the unit interval at which the true density
function is relatively high, and it seems reasonable to suppose that the
conditional probability of $C_n(p)$ given $X_1, \ldots, X_n$ has a high probability of
approaching some limit greater than $p$. This conjecture is proved by the theorem of
Section 2. Some applications to testing the hypothesis of a uniform distribution
are discussed in the rest of the paper. The tests discussed reject the hypothesis if
the proportion of events $C_1(p), C_2(p), \ldots$ which occur is ‘too far’ above $p$.


2. The basic theorem. Throughout this section it is assumed that the common
cumulative distribution function $F(x)$ of $X_1, X_2, \ldots$ assigns all probability to
the interval $[0, 1]$, and has a derivative $f(x)$ which is bounded and has at most a
finite number of discontinuities. For any nonnegative $t$, $Q(t; f)$ denotes

\[ 1 - \int_0^1 [1 + tf(x)] \exp[-tf(x)]\,dx, \]

and $M(t; f)$ denotes

\[ 1 - \int_0^1 f(x)[1 + tf(x)] \exp[-tf(x)]\,dx. \]

It is easily verified that $Q(t; f)$ is a continuous and strictly increasing
function of $t$ for nonnegative $t$, with $Q(0; f) = 0$, $\lim_{t \to \infty} Q(t; f) = 1$. Denote by
$t(p)$ the unique solution in $t$ of the equation $Q(t; f) = p$.
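The quantities $Q(t; f)$, $M(t; f)$ and $t(p)$ are easy to evaluate numerically. The sketch below is an illustration under stated assumptions rather than anything given in the paper: it uses a midpoint rule on $[0, 1]$ and bisection, and the helper names (`Q`, `M`, `t_of_p`, `_integrate`) are hypothetical. It assumes the density is positive almost everywhere, so that $Q(t; f)$ does reach $p$.

```python
import math


def _integrate(g, n=2000):
    """Midpoint-rule approximation of the integral of g over [0, 1]."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))


def Q(t, f):
    """Q(t; f) = 1 - integral of [1 + t f(x)] exp[-t f(x)] over [0, 1]."""
    return 1.0 - _integrate(lambda x: (1.0 + t * f(x)) * math.exp(-t * f(x)))


def M(t, f):
    """M(t; f) = 1 - integral of f(x) [1 + t f(x)] exp[-t f(x)] over [0, 1]."""
    return 1.0 - _integrate(lambda x: f(x) * (1.0 + t * f(x)) * math.exp(-t * f(x)))


def t_of_p(p, f, tol=1e-10):
    """Solve Q(t; f) = p for t by bisection (Q is increasing in t)."""
    lo, hi = 0.0, 1.0
    while Q(hi, f) < p:          # assumes Q eventually exceeds p
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if Q(mid, f) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For the uniform density, `t_of_p(0.5, lambda x: 1.0)` is close to the value 1.6784 used in Section 5, and `M` at that point equals 0.5. For a non-uniform density such as `lambda x: 2 * x`, the value of `M` at `t_of_p(0.5, ...)` exceeds 0.5, in line with the theorem that follows.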

THEOREM. The conditional probability of the event $C_n(p)$ given $X_1, \ldots, X_n$
converges to $M(t(p); f)$ with probability one as $n$ increases. Also, $M(t(p); f) \ge p$,
with equality holding if and only if $f(x) = 1$ almost everywhere on $[0, 1]$.

The remainder of this section is devoted to proving this theorem. The
introduction of some detailed notation is necessary. Let $Z_1(n), \ldots, Z_{n+1}(n)$ denote
the ordered values of $T_1'(n), \ldots, T_{n+1}'(n)$, where $0 \le Z_1(n) \le \cdots \le Z_{n+1}(n)$
and $\sum_{i=1}^{n+1} Z_i(n) = 1$. With probability one, $Z_i(n) < Z_{i+1}(n)$ for $i = 1, \ldots, n$.
$J_n(p)$ is defined as the largest integer such that $\sum_{i=1}^{J_n(p)} Z_i(n) < p$. The set
$S_n(p)$ as defined above is the union of $J_n(p) + 1$ closed subintervals: the $J_n(p)$
subintervals $[Y_i(n), Y_{i+1}(n)]$ such that $Y_{i+1}(n) - Y_i(n) = Z_j(n)$ for some
$j \le J_n(p)$, plus the subinterval $[Y_k(n), Y_k(n) + \Delta]$, where $Y_{k+1}(n) - Y_k(n) =
Z_{J_n(p)+1}(n)$, and $\Delta$ is chosen so that the Lebesgue measure of $S_n(p)$ is exactly $p$.

$N_n(t)$ denotes the number of the quantities $Z_1(n), \ldots, Z_{n+1}(n)$ which are no
greater than $t/(n + 1)$, and $R_n(t)$ denotes $(n + 1)^{-1} N_n(t)$. $L_n(t)$ denotes
$\sum_{Z_i(n) \le t/(n+1)} Z_i(n)$, and $K_n(t)$ denotes the total probability assigned by $F(x)$ to
the union of the $N_n(t)$ intervals $[Y_j(n), Y_{j+1}(n)]$ such that $Y_{j+1}(n) - Y_j(n)
\le t/(n + 1)$. That is,

\[ K_n(t) = \sum_{Y_{j+1}(n) - Y_j(n) \le t/(n+1)} [F(Y_{j+1}(n)) - F(Y_j(n))]. \]
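For a concrete sample, the four quantities $N_n(t)$, $R_n(t)$, $L_n(t)$ and $K_n(t)$ can be computed in one pass over the spacings. This is a small sketch for illustration; the function name `spacing_functionals` and the `cdf` argument (standing in for $F$) are assumptions of this example, not notation from the paper.

```python
def spacing_functionals(xs, t, cdf=lambda x: x):
    """Return (N_n(t), R_n(t), L_n(t), K_n(t)) for observations xs in [0, 1].

    cdf is the distribution function F; the default is the uniform case F(x) = x.
    """
    n = len(xs)
    ys = [0.0] + sorted(xs) + [1.0]      # Y_0(n), ..., Y_{n+1}(n)
    cutoff = t / (n + 1)
    count, length, prob = 0, 0.0, 0.0
    for left, right in zip(ys, ys[1:]):
        if right - left <= cutoff:       # spacing no greater than t/(n+1)
            count += 1
            length += right - left                # contributes to L_n(t)
            prob += cdf(right) - cdf(left)        # contributes to K_n(t)
    return count, count / (n + 1), length, prob
```

With `xs = [0.1, 0.4, 0.5]` and `t = 0.5`, the cutoff is 0.125, two spacings qualify, and both $L_n(t)$ and (under the uniform $F$) $K_n(t)$ equal 0.2.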
LEMMA 1. $L_n(t)$ converges to $Q(t; f)$ with probability one as $n$ increases.

PROOF OF LEMMA 1. $L_n(t) = \int_0^t [u/(n + 1)]\,dN_n(u)$, the integral being
Riemann-Stieltjes. Then $L_n(t) = \int_0^t u\,dR_n(u) = tR_n(t) - \int_0^t R_n(u)\,du$. In [1] it was
proved that

\[ \sup_{u \ge 0} \left| R_n(u) - \left[ 1 - \int_0^1 f(x) \exp(-uf(x))\,dx \right] \right| \]

converges to zero with probability one as $n$ increases. Then, with probability one
as $n$ increases, $L_n(t)$ converges to

\[ t \left[ 1 - \int_0^1 f(x) \exp(-tf(x))\,dx \right] - \int_0^t \left[ 1 - \int_0^1 f(x) \exp(-uf(x))\,dx \right] du, \]

and this last expression is easily seen to be equal to $Q(t; f)$, completing the proof
of Lemma 1.
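The final identification with $Q(t; f)$ can be written out as follows. This is a worked step added for the reader; it assumes only that the order of integration may be interchanged, which holds because $f$ is bounded.

```latex
\begin{aligned}
t\Bigl[1 - \int_0^1 f(x)e^{-tf(x)}\,dx\Bigr]
  - \int_0^t \Bigl[1 - \int_0^1 f(x)e^{-uf(x)}\,dx\Bigr]du
 &= -t\int_0^1 f(x)e^{-tf(x)}\,dx
    + \int_0^1 \Bigl[\int_0^t f(x)e^{-uf(x)}\,du\Bigr]dx \\
 &= -t\int_0^1 f(x)e^{-tf(x)}\,dx
    + \int_0^1 \bigl[1 - e^{-tf(x)}\bigr]dx \\
 &= 1 - \int_0^1 [1 + tf(x)]\,e^{-tf(x)}\,dx = Q(t; f).
\end{aligned}
```

The inner integral uses $\int_0^t f(x)e^{-uf(x)}\,du = 1 - e^{-tf(x)}$, which also holds trivially where $f(x) = 0$.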

LEMMA 2. $K_n(t)$ converges to $M(t; f)$ with probability one as $n$ increases.

PROOF OF LEMMA 2. By the assumption about $f(x)$, for any given positive $\gamma$ it
is possible to break the interval $[0, 1]$ into a finite number of subintervals, such
that in the interior of each subinterval there is a variation of $f(x)$ which is no
greater than $\gamma$. Suppose there are $b(\gamma)$ such subintervals, the endpoints of the
$i$th such subinterval being denoted by $c_i$, $d_i$, with $c_i < d_i$. $m_i$ denotes
$\inf_{c_i < x < d_i} f(x)$, and $M_i$ denotes $\sup_{c_i < x < d_i} f(x)$, where $0 \le M_i - m_i \le \gamma$. $q_i$
denotes $F(d_i) - F(c_i)$, and $N_i$ denotes the number of values among $X_1, \ldots, X_n$
falling in the closed interval $[c_i, d_i]$. $N_i$ has a binomial distribution with
parameters $n$, $q_i$. It may be assumed that $q_i$ is positive, since if $q_i = 0$ the interval
$[c_i, d_i]$ can be ignored in what follows.

Denote by $Y_1'(i) \le Y_2'(i) \le \cdots \le Y_{N_i}'(i)$ the ordered values of the $N_i$
observations in the interval $[c_i, d_i]$. $Y_0'(i)$, $Y_{N_i+1}'(i)$ denote $c_i$, $d_i$ respectively.
$T_j'(i)$ denotes $Y_j'(i) - Y_{j-1}'(i)$, for $j = 1, \ldots, N_i + 1$. ${}_iN_n(t)$ denotes the
number of the quantities $T_1'(i), \ldots, T_{N_i+1}'(i)$ which are no greater than
$t/(N_i + 1)$, and ${}_iR_n(t)$ denotes $(N_i + 1)^{-1}\,{}_iN_n(t)$. Since $(N_i + 1)/(n + 1)$
converges to $q_i$ with probability one as $n$ increases, it follows from [1] that

\[ \sup_t \left| {}_iR_n(t) - \left[ 1 - \int_{c_i}^{d_i} \frac{f(x)}{q_i} \exp\left[ -\frac{tf(x)}{q_i} \right] dx \right] \right| \]

converges to zero with probability one as $n$ increases. From this it follows by the
same sort of calculation used in the proof of Lemma 1 that
$\sum_{T_j'(i) \le t/(N_i+1)} T_j'(i)$ converges to

\[ d_i - c_i - \int_{c_i}^{d_i} \left[ 1 + \frac{tf(x)}{q_i} \right] \exp\left[ -\frac{tf(x)}{q_i} \right] dx = \rho_i(t), \]

say, with probability one as $n$ increases. $\sum_{T_j'(i) \le t/(n+1)} T_j'(i)$ can be written as
$\sum_{T_j'(i) \le t^*/(N_i+1)} T_j'(i)$, where $t^* = t(N_i + 1)/(n + 1)$, and because $(N_i + 1)/(n + 1)$
converges to $q_i$ with probability one as $n$ increases, and $\rho_i(t)$ is a
continuous function of $t$, it follows that $\sum_{T_j'(i) \le t/(n+1)} T_j'(i)$ converges to $\rho_i(q_i t)$
with probability one as $n$ increases.

Denote the total probability assigned by $F(x)$ to the union of all the
subintervals $(Y_{j-1}'(i), Y_j'(i))$ with $Y_j'(i) - Y_{j-1}'(i) \le t/(n + 1)$ by $\theta_i(t)$. Then

\[ m_i \sum_{T_j'(i) \le t/(n+1)} T_j'(i) \le \theta_i(t) \le M_i \sum_{T_j'(i) \le t/(n+1)} T_j'(i), \]

and $\sum_{T_j'(i) \le t/(n+1)} T_j'(i)$
can be written as $\rho_i(q_i t) + \delta_i(n)$, where $\delta_i(n)$ converges to zero with probability
one as $n$ increases. Also,

\[ K_n(t) = \sum_{i=1}^{b(\gamma)} \theta_i(t) + \epsilon_n, \]

where $\epsilon_n$ represents a term that takes account of the fact that at most $2b(\gamma)$ of
the original subintervals $(Y_{j-1}(n), Y_j(n))$ were broken into two parts by the
points $c_i$, $d_i$. Clearly, $\epsilon_n$ converges to zero with probability one as $n$ increases.
Then

\[ \sum_{i=1}^{b(\gamma)} m_i \rho_i(q_i t) + \sum_{i=1}^{b(\gamma)} m_i \delta_i(n) + \epsilon_n \le K_n(t) \le \sum_{i=1}^{b(\gamma)} M_i \rho_i(q_i t) + \sum_{i=1}^{b(\gamma)} M_i \delta_i(n) + \epsilon_n. \]

By taking $\gamma$ small enough, $\sum_{i=1}^{b(\gamma)} m_i \rho_i(q_i t)$ and $\sum_{i=1}^{b(\gamma)} M_i \rho_i(q_i t)$ can both be
made arbitrarily close to $M(t; f)$. Since $\delta_1(n), \ldots, \delta_{b(\gamma)}(n)$, $\epsilon_n$ approach zero
with probability one as $n$ increases, Lemma 2 follows immediately.

Define the set $T_n^*(t)$ as the union of the intervals $(Y_{j-1}(n), Y_j(n))$ for all $j$
for which $Y_j(n) - Y_{j-1}(n) \le t/(n + 1)$. It follows from Lemma 1 that the sets
$S_n(p)$ and $T_n^*(t(p))$ differ by a set whose measure approaches zero with
probability one as $n$ increases. It follows from Lemma 2 that the total probability
assigned by $F(x)$ to the set $T_n^*(t(p))$ converges to $M(t(p); f)$ with probability
one as $n$ increases. Therefore the total probability assigned by $F(x)$ to the set
$S_n(p)$ converges to $M(t(p); f)$ with probability one as $n$ increases. Since the
probability assigned to $S_n(p)$ by $F(x)$ is the conditional probability of the
event $C_n(p)$ given $X_1, \ldots, X_n$, the first part of the theorem is proved. The
second part of the theorem is a direct consequence of the following lemma.

LEMMA 3. $Q(t; f) \le M(t; f)$ for all $f(x)$ and for each positive $t$, with equality if
and only if $f(x) = 1$ almost everywhere on $[0, 1]$.

PROOF OF LEMMA 3. $M(t; f) - Q(t; f)$ can be written as

\[ \int_0^1 [1 + tf(x)][1 - f(x)] \exp(-tf(x))\,dx. \]

The function $[1 + tf(x)] \exp(-tf(x))$ is equal to unity when $f(x) = 0$, and
decreases strictly monotonically toward zero as $f(x)$ increases. Then $M(t; f) -
Q(t; f)$ can be written as

\[ \int_{x: f(x) < 1} [1 - f(x)][1 + tf(x)] \exp(-tf(x))\,dx + \int_{x: f(x) \ge 1} [1 - f(x)][1 + tf(x)] \exp(-tf(x))\,dx. \]

The first of these integrals is at least equal to $(1 + t)e^{-t} \int_{x: f(x) < 1} [1 - f(x)]\,dx$,
with equality if and only if the subset of $[0, 1]$ where $f(x) < 1$ has measure zero,
and the second of the integrals is at least equal to $(1 + t)e^{-t} \int_{x: f(x) \ge 1} [1 - f(x)]\,dx$,
with equality if and only if the subset of $[0, 1]$ where $f(x) > 1$ has measure zero.
Therefore

\[ M(t; f) - Q(t; f) \ge (1 + t)e^{-t} \left[ \int_{x: f(x) < 1} [1 - f(x)]\,dx + \int_{x: f(x) \ge 1} [1 - f(x)]\,dx \right] = 0, \]





with equality if and only if $f(x) = 1$ almost everywhere on $[0, 1]$. This completes
the proof of Lemma 3, and of the theorem.


3. Application of the theorem to a nonsequential test of fit. Let $W_i$ denote the
random variable which is equal to one if the event $C_i(p)$ occurs, and is equal to
zero otherwise, for $i = 1, 2, \ldots$. Define $V_n$ as $W_1 + \cdots + W_n$. If the
hypothesis of a uniform distribution for $X_1, X_2, \ldots$ is true, then $V_n$ has a
binomial distribution with parameters $n$, $p$. If $X_1, \ldots, X_{n+1}$ are observed, a
possible test of the hypothesis is to reject when $V_n/n \ge d_n(\alpha)$, where $d_n(\alpha)$ is a
constant chosen to give the desired level of significance $\alpha$. If $n$ is large, then
$d_n(\alpha)$ is approximately $p + z_\alpha [p(1 - p)/n]^{1/2}$, where
$(2\pi)^{-1/2} \int_{z_\alpha}^{\infty} \exp(-\tfrac{1}{2}t^2)\,dt = \alpha$.
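The large-sample critical value can be computed directly from the normal quantile. The sketch below is illustrative: the function names `critical_value` and `reject_uniformity` are hypothetical, and the list `w` is assumed to hold the observed indicators $W_1, \ldots, W_n$.

```python
import math
from statistics import NormalDist


def critical_value(n, p, alpha):
    """Approximate d_n(alpha) = p + z_alpha * sqrt(p (1 - p) / n)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # upper-alpha point of N(0, 1)
    return p + z_alpha * math.sqrt(p * (1.0 - p) / n)


def reject_uniformity(w, p, alpha):
    """Reject the hypothesis of uniformity when V_n / n >= d_n(alpha)."""
    n = len(w)
    return sum(w) / n >= critical_value(n, p, alpha)
```

For $n = 100$, $p = 1/2$ and $\alpha = .05$ this gives a critical value of about .5822, so 70 occurrences out of 100 would lead to rejection while 50 would not.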

The consistency of this proposed test will be shown if it is shown that $V_n/n$
converges stochastically to $M(t(p); f)$ as $n$ increases, since $M(t(p); f) > p$ if
the hypothesis is not true, and the critical value for $V_n/n$ approaches $p$ as $n$
increases. The convergence of $V_n/n$ to $M(t(p); f)$ is shown as follows.

For convenience, let $X(j)$ denote the sequence $(X_1, \ldots, X_j)$, and let $Q_j$
denote $W_j - M(t(p); f)$. Define $r_j$ as $E|E(W_j \mid X(j)) - M(t(p); f)|$. Since
$E(W_j \mid X(j))$ is simply the conditional probability that $C_j(p)$ will occur, given
$X(j)$, the theorem of Section 2 shows that $r_j$ approaches zero as $j$ increases.
Clearly, $0 \le r_j \le 1$ for all $j$. If $i < j$,

\[ E(Q_i Q_j) = E\{E(Q_i Q_j \mid X(j))\} = E\{Q_i E(Q_j \mid X(j))\}, \]

since if $i < j$, $Q_i$ is a uniquely determined function of $X(j)$. Also,

\[ E\{Q_i E(Q_j \mid X(j))\} \le E|Q_i E(Q_j \mid X(j))| = E\{|Q_i|\,|E(Q_j \mid X(j))|\} \le E|E(Q_j \mid X(j))| = r_j, \]

the last inequality holding because $|Q_i| \le 1$ with probability one. Then

\[ E(V_n/n - M(t(p); f))^2 = E\Bigl(n^{-1} \sum_{i=1}^{n} Q_i\Bigr)^2 = n^{-2} \Bigl[ \sum_{i=1}^{n} E(Q_i^2) + 2 \sum_{i<j} E(Q_i Q_j) \Bigr] \le n^{-2} \Bigl[ n + 2 \sum_{i<j} r_j \Bigr] \le n^{-1} + 2n^{-1} \sum_{j=1}^{n} r_j, \]

and because $r_j$ approaches zero as $j$ increases, this last expression approaches
zero as $n$ increases. Therefore $E(V_n/n - M(t(p); f))^2$ converges to zero as $n$
increases, and the stochastic convergence of $V_n/n$ to $M(t(p); f)$ follows from
Chebyshev’s inequality.

The value of the statistic $V_n/n$ may change if the values of $X_1, \ldots, X_{n+1}$
are permuted. In most fixed sample size tests of the hypothesis, the statistic is
invariant under permutation of the observations.


4. A sequential test of fit. When the hypothesis of uniform distribution is
true, $W_1, W_2, \ldots$ are independent random variables, with $P(W_i = 1) = p$
for $i = 1, 2, \ldots$. When the hypothesis is not true, $W_1, W_2, \ldots$ are not
independent, but it has been shown that $P(W_n = 1 \mid X_1, \ldots, X_n)$ converges
with probability one to $M(t(p); f)$ as $n$ increases, where $f(x)$ is the common
probability density function of $X_1, X_2, \ldots$. Wald’s sequential test [2] that
a binomial mean has a given value $p$ against the alternative that its value is
$p_1$ ($p_1 > p$), with error probabilities $\alpha$, $\beta$, can be applied to the test of uniform
distribution, as follows. Let $a_m$ denote

\[ \frac{\log[\beta/(1 - \alpha)] + m \log[(1 - p)/(1 - p_1)]}{\log[p_1(1 - p)] - \log[p(1 - p_1)]} \]

and $r_m$ denote

\[ \frac{\log[(1 - \beta)/\alpha] + m \log[(1 - p)/(1 - p_1)]}{\log[p_1(1 - p)] - \log[p(1 - p_1)]}. \]

The test continues as long as $a_m < W_1 + \cdots + W_m < r_m$. The first time that
these inequalities do not hold, the hypothesis of uniform distribution is accepted
if $W_1 + \cdots + W_m \le a_m$, and is rejected if $W_1 + \cdots + W_m \ge r_m$.
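The stopping rule just described can be sketched as a short routine. This is an illustration only: the names `wald_boundaries` and `sequential_test` are hypothetical, and the input `ws` is assumed to be the sequence of indicators $W_1, W_2, \ldots$ in the order observed.

```python
import math


def wald_boundaries(m, p, p1, alpha, beta):
    """Return (a_m, r_m), the acceptance and rejection boundaries at stage m."""
    denom = math.log(p1 * (1.0 - p)) - math.log(p * (1.0 - p1))
    drift = m * math.log((1.0 - p) / (1.0 - p1))
    a_m = (math.log(beta / (1.0 - alpha)) + drift) / denom
    r_m = (math.log((1.0 - beta) / alpha) + drift) / denom
    return a_m, r_m


def sequential_test(ws, p, p1, alpha, beta):
    """Run the test on indicators ws; return (decision, stage reached)."""
    total = 0
    for m, w in enumerate(ws, start=1):
        total += w
        a_m, r_m = wald_boundaries(m, p, p1, alpha, beta)
        if total <= a_m:
            return "accept", m
        if total >= r_m:
            return "reject", m
    return "undecided", len(ws)      # data ran out before a boundary was crossed
```

With $p = .5$, $p_1 = .6$ and $\alpha = \beta = .05$, an unbroken run of ones leads to rejection at the 17th indicator, and an unbroken run of zeros leads to acceptance at the 14th.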

When the hypothesis of uniform distribution is true, $W_1, W_2, \ldots$ are
independent random variables with $P(W_i = 1) = p$, and therefore the
probability of acceptance and the expected number of observations of the sequential
test are known, at least approximately, through the Wald approximations. In
particular, when the hypothesis is true the probability of its rejection is
approximately $\alpha$.

Since it has been shown that when the common probability density function
is $f(x)$, $P(W_n = 1 \mid X_1, \ldots, X_n)$ converges with probability one to $M(t(p); f)$
as $n$ increases, it is tempting to say that when the common probability density
function is $f(x)$, the sequential test has approximately the same properties as the
Wald test for the binomial case when the binomial mean is equal to $M(t(p); f)$.
However, there are certain obstacles, which will now be discussed, in the way of
doing this.

The first obstacle is the following. Even if $P(W_n = 1 \mid X_1, \ldots, X_n)$ were
close to $M(t(p); f)$ for all $n$, might the small differences lead to large differences
between the properties of the proposed test and the properties of the Wald
binomial test? A negative answer to this question can be given, since the
properties of the Wald test vary continuously with the binomial mean.

A second obstacle is the following. Convergence with probability one as $n$
increases does not rule out fairly large probabilities of large differences between
$P(W_n = 1 \mid X_1, \ldots, X_n)$ and $M(t(p); f)$ for small values of $n$. Might this lead
to large differences between the properties of the proposed sequential test of
uniformity and the Wald binomial test? The exact answer to this question
remains unknown, and may be in the affirmative in general. However, if $\alpha$ and $\beta$
are small, a large number of observations will be taken, as is obvious from the
form of the Wald test. For the later observations in the sequence, the
convergence theorem applies. From the form of the decision boundaries that the Wald
test applies to the random walk, it is easily seen that even large disturbances in
$P(W_n = 1 \mid X_1, \ldots, X_n)$ for a relatively few initial values of $n$ will have only
a small effect on the properties of the test. Thus this second obstacle diminishes
as $\alpha$ and $\beta$ become smaller.

A third obstacle is the following. What is the effect of those sample sequences
for which $P(W_n = 1 \mid X_1, \ldots, X_n)$ is not close to $M(t(p); f)$ even for large
values of $n$? Since it has been shown that such sample sequences have a
probability approaching zero as $n$ increases, it is clear that they cannot have much
effect on the probability of rejecting the hypothesis, if $\alpha$ and $\beta$ are small.
However, these sample sequences may conceivably have a large effect on the expected
sample size, since the sample size is an unbounded function over all possible
sample sequences, and the expected value of an unbounded function can be
greatly affected even by small changes in the probabilities of large values.
Whether this effect on the expected sample size actually exists is an open
question. However, the probability that the sample size will be less than $m$, for any
fixed $m$, will not be affected much, since this probability is a bounded function.


5. Comparison of the sequential test with other tests. In trying to compare the
sequential test of uniform distribution with existing tests of this hypothesis, a
difficulty is that for very few of the existing tests (which are all predetermined
sample size tests) is even the asymptotic power known. However, in [3] the
asymptotic power of the test which rejects when $\sum_{i=1}^{n+1} Z_i^2(n)$ is “too large”
($n$ is a predetermined number) is found. Furthermore, in [4] it is shown that
this test is admissible among all fixed sample size tests. This test will be called
the “Z test” for convenience, and will be compared to the sequential test.

For computational purposes, $p$ will be set equal to $\tfrac{1}{2}$ and $\alpha$ will be set equal to
$\beta$. $f(x)$ will be written as $1 + cr(x)$, where $|r(x)|$ is bounded,

\[ \int_0^1 r(x)\,dx = 0, \qquad \int_0^1 r^2(x)\,dx = D > 0, \]

and small absolute values of $c$ are of interest. After straightforward but
somewhat lengthy calculations, it is found that $t(\tfrac{1}{2}) = 1.6784 + .5693Dc^2 + o(c^2)$,
and using this, that $M(t(\tfrac{1}{2}); f) = \tfrac{1}{2} + .5259Dc^2 + o(c^2)$. Set $p_1 = \tfrac{1}{2} + .5259Dc^2$,
so that the power of the sequential test of uniform distribution against the
alternative $f(x) = 1 + cr(x)$ is approximately $1 - \alpha$. Denote by $E(\alpha, c)$ the
expected sample size when the sequential test is used and the hypothesis of
uniform distribution is true. Then from the known Wald approximations, it is
found that asymptotically

\[ E(\alpha, c) = \frac{(1 - \alpha) \log(\alpha/(1 - \alpha)) + \alpha \log((1 - \alpha)/\alpha)}{\tfrac{1}{2} \log(1 + 1.0518Dc^2) + \tfrac{1}{2} \log(1 - 1.0518Dc^2)} = \frac{2 \log \alpha - 4\alpha \log \alpha + (4\alpha - 2) \log(1 - \alpha)}{-1.1067D^2c^4 + o(c^4)}, \]

the approximation becoming better as $c$ becomes smaller in absolute value.
Next the sample size necessary to give the Z test level of significance $\alpha$ and
power $1 - \alpha$ against the alternative $f(x) = 1 + cr(x)$ is found. Denote this
sample size by $N(\alpha, c)$, and denote by $k(\alpha)$ the quantity satisfying
$(2\pi)^{-1/2} \int_{k(\alpha)}^{\infty} \exp(-\tfrac{1}{2}t^2)\,dt = \alpha$.
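The expansion constants 1.6784, .5693 and .5259 quoted above can be checked numerically. The closed forms used in the sketch below are not stated in the paper; they come from a second-order Taylor expansion in $c$ carried out for this illustration (using $\int_0^1 r(x)\,dx = 0$ and $\int_0^1 r^2(x)\,dx = D$), which gives $t(\tfrac{1}{2}) = t_0 + \tfrac{1}{2}t_0(t_0 - 1)Dc^2 + o(c^2)$ and $M(t(\tfrac{1}{2}); f) = \tfrac{1}{2} + t_0^2 e^{-t_0}Dc^2 + o(c^2)$, where $t_0$ solves $(1 + t_0)e^{-t_0} = \tfrac{1}{2}$. The function name `t0_uniform` is hypothetical.

```python
import math


def t0_uniform(p=0.5, tol=1e-12):
    """Solve 1 - (1 + t) exp(-t) = p, i.e. Q(t; f) = p for the uniform density."""
    lo, hi = 0.0, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if 1.0 - (1.0 + mid) * math.exp(-mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


t0 = t0_uniform()
coef_t = 0.5 * t0 * (t0 - 1.0)        # coefficient of D c^2 in t(1/2)
coef_M = t0 * t0 * math.exp(-t0)      # coefficient of D c^2 in M(t(1/2); f)
print(round(t0, 4), round(coef_t, 4), round(coef_M, 4))
```

Under these assumptions the three values agree with the constants in the text to about three decimal places.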


It is found directly from [3] that $N(\alpha, c)$ is asymptotically the solution for $n$ in
an equation involving $k(\alpha)$ and integrals over $[0, 1]$ of negative powers of
$1 + cr(x)$ (the displayed equation is not legible in this copy),
the approximation becoming better as $\alpha$ decreases. Since $k(1 - \alpha) = -k(\alpha)$,
$N(\alpha, c)$ is asymptotically equal to

\[ k^2(\alpha) \left[ \frac{2 + 10Dc^2 + o(c^2) + 2(1 + 10Dc^2 + o(c^2))^{1/2}}{D^2c^4 + o(c^4)} \right]. \]

From page 166 of [5], we find that $-2 \log \alpha$ approaches $\log 2\pi + 2 \log k(\alpha) +
k^2(\alpha)$ asymptotically as $\alpha$ approaches zero. From this, it follows that
$\lim_{\alpha \to 0} N(\alpha, c)/E(\alpha, c) = 4.4268 + \delta(c)$, where $\lim_{c \to 0} \delta(c) = 0$. Thus for $\alpha$
and $c$ near zero, the sequential test of uniform distribution requires on the
average only $1/4.4268$ times the number of observations required by the Z test of the
same size and power against the alternative $f(x) = 1 + cr(x)$, when the
hypothesis is true. This is a substantial saving.
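The constant 4.4268 can be traced from the two asymptotic expressions. The following is a heuristic sketch of the leading terms only, with the $o(\cdot)$ remainders dropped; it is added for the reader and is not a substitute for the limiting argument in the text.

```latex
\begin{aligned}
N(\alpha, c) &\approx \frac{4\,k^2(\alpha)}{D^2 c^4}
  \quad\text{(the bracket } 2 + 10Dc^2 + 2(1 + 10Dc^2)^{1/2} \to 4 \text{ as } c \to 0\text{)},\\[4pt]
E(\alpha, c) &\approx \frac{-2\log\alpha + 4\alpha\log\alpha - (4\alpha - 2)\log(1 - \alpha)}{1.1067\,D^2 c^4}
  \;\sim\; \frac{-2\log\alpha}{1.1067\,D^2 c^4}
  \quad\text{as } \alpha \to 0,\\[4pt]
\frac{N(\alpha, c)}{E(\alpha, c)} &\approx 4 \times 1.1067 \times \frac{k^2(\alpha)}{-2\log\alpha}
  \;\longrightarrow\; 4.4268,
  \quad\text{since } -2\log\alpha \sim k^2(\alpha) \text{ as } \alpha \to 0.
\end{aligned}
```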


6. Acknowledgment. The author would like to thank the referee for many
useful suggestions. In particular, the referee suggested the fixed sample size test
described in Section 3 as an application of the basic theorem.


SOME NONPARAMETRIC MEDIAN PROCEDURES
By V. P. BHAPKAR 


University of North Carolina and University of Poona 


1. Introduction and summary. Under the nonparametric approach, various 
methods have been suggested to avoid the assumption of normality and ho- 
moscedasticity implicit in the analysis of variance. For the one-way classifica- 
tion, i.e., to decide whether c samples come from the same population, Kruskal 
and Wallis [9] have proposed the H-test based on ranks; Mood and Brown [10] 
have proposed the M-test, utilizing the numbers of observations above the 
median of the combined sample; while the present author [2] has offered the 
V-test based on the number of c-plets that can be formed by choosing one ob- 
servation from each sample such that the observation from the kth sample is 
the least, $k = 1, 2, \ldots, c$.

For the two-way classification, Friedman [8] has made use of ranks. His 
statistic, to test the hypothesis that the rankings by $m$ “observers” of $n$ “objects”
are independent, essentially offers a test for the two-way classification with one 
observation per cell. Durbin [7] has given a generalization for the balanced in- 
complete block design. Benard and Van Elteren [1] have generalized it still 
further. Mood and Brown [6, 10] also have considered the two-way classification 
with one observation per cell or the same number of observations per cell. In 
the first part of the present paper, their test has been extended to cover incom- 
plete block situations. 

Mood and Brown [6, 10] have also considered some simple regression problems. 
In the present paper their methods are extended to discuss some additional 
regression problems. Next some bivariate analysis of variance problems are 
considered. The “step-down procedure” [11, 12] is used to reduce the problem
to the univariate case with the other variate as a concomitant variate. The 
regression methods developed earlier are used here in these bivariate problems. 
The method seems to be perfectly general and could be extended to three or more 
variates, that is, to the multivariate situation. Most of the tests are offered 
on heuristic considerations. They are expected to be good for large samples. 


2. Extension of Mood’s test for the two-way classification to incomplete 
designs. Mood and Brown [10] have considered a test for equality of “row”
effects in the usual setup with r rows, c columns and one observation per cell,
say $x_{ij}$ in the $(ij)$th cell. The distribution of the $x_{ij}$ is assumed to have median
$\nu_{ij} = \alpha_i + \beta_j + \mu$, where the median of the $\alpha$'s is zero as is the median of the
$\beta$'s. By the median we shall always mean the middle or the average of the two
middle quantities. The distributions are assumed to be continuous and identical,
except for location.

Under the null hypothesis that the row effects, $\alpha$'s, are equal (i.e., zero), all
the observations in a given column have the same distribution. Let $\tilde{x}_j$ be the
median of the observations in the jth column, and form a two-way table by
replacing the observation $x_{ij}$ by a plus sign if it exceeds $\tilde{x}_j$, or by a minus sign
if it does not. Let $m_i$ be the number of plus signs in the ith row. The test
criterion used in [10] is

$$\chi^2 = \frac{r(r-1)}{ca(r-a)} \sum_{i=1}^{r} \left(m_i - \frac{ca}{r}\right)^2, \tag{1}$$

where $a = \tfrac{1}{2}r$ if r is even or $\tfrac{1}{2}(r - 1)$ if r is odd. Unless c is small, the $\chi^2$
approximation with $r - 1$ d.f. is used. It is suggested [10] that for practical purposes
the large sample distribution is satisfactory if $c \ge 10$ or even if $c \ge 5$ provided
$rc \ge 20$. For smaller c, exact probabilities could be computed. We shall consider
the generalization to incomplete blocks.
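
The criterion (1) is simple to compute. The following Python sketch is an illustration only and is not taken from [10]; the function name `mood_two_way_statistic` and the list-of-rows input layout are assumptions made here, and ties are assumed absent, in line with the continuity assumption.

```python
from statistics import median


def mood_two_way_statistic(x):
    """Median criterion (1) for a complete r x c layout.

    x[i][j] is the single observation in row i and column j.
    Returns (m, chi2, df), where m[i] counts the plus signs in row i.
    """
    r, c = len(x), len(x[0])
    a = r // 2  # r/2 if r is even, (r - 1)/2 if r is odd
    m = [0] * r
    for j in range(c):
        col_median = median(x[i][j] for i in range(r))
        for i in range(r):
            if x[i][j] > col_median:  # plus sign
                m[i] += 1
    expected = c * a / r
    chi2 = r * (r - 1) / (c * a * (r - a)) * sum((mi - expected) ** 2 for mi in m)
    return m, chi2, r - 1
```

For small c the chi-square reference distribution would be replaced by exact probabilities, as noted above; the sketch returns only the statistic and its degrees of freedom.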

Let us write $n_{ij} = 1$ if the $(ij)$ combination is allowed and zero otherwise.
Let the number of observations in the ith row be $c_i$ $(i = 1, 2, \cdots, r)$ and in
the jth column be $r_j$ $(j = 1, 2, \cdots, c)$. Let $a_j = \tfrac{1}{2}r_j$ if $r_j$ is even or $\tfrac{1}{2}(r_j - 1)$ if
$r_j$ is odd. Then there are $a_j$ plus signs in the jth column. Let the $m_i$ be defined
as before. Then we expect (under $H_0$) $m_i$ to be approximately equal to $\tfrac{1}{2}c_i$.

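Following Mood, let us derive the generating function to find the distribution
of the m's. Suppose $t_i$ is associated with a plus sign in the ith row
$(i = 1, 2, \cdots, r)$. Let $\phi_{a_j}(t_{j_1}, \cdots, t_{j_{r_j}})$ consist of the sum of all terms that can
be formed by multiplying the t's together, $a_j$ at a time. Each term of $\phi_{a_j}$ describes
a possible arrangement of signs in a given column. Furthermore, each arrangement
of signs is equally likely; hence the probability of a particular arrangement
is $1\big/\binom{r_j}{a_j}$.

Suppose the jth column contains observations in the $j_1, j_2, \cdots, j_{r_j}$th rows.
Then consider the function

$$\Phi = \prod_{j=1}^{c} \left[ \phi_{a_j}(t_{j_1}, \cdots, t_{j_{r_j}}) \Big/ \binom{r_j}{a_j} \right]. \tag{2}$$

There is a one-to-one correspondence between the ways of getting terms
$t_1^{m_1} t_2^{m_2} \cdots t_r^{m_r}$ in the numerator of $\Phi$ and the arrangements of signs in the $r \times c$
table which give rise to $m_i$ plus signs in the ith row $(i = 1, 2, \cdots, r)$. Hence

$$\Phi = \sum_{m_1} \sum_{m_2} \cdots \sum_{m_r} g(m_1, \cdots, m_r)\, t_1^{m_1} t_2^{m_2} \cdots t_r^{m_r},$$

where g is the frequency function for the m's.

Note that $\phi_{a_j}(1, 1, \cdots, 1) = \binom{r_j}{a_j}$. $\Phi$ is thus the factorial-moment generating
function for the m's. Then $\mathcal{E}(m_i) = \partial\Phi/\partial t_i$ with all the t's = 1.
We note that

$$\frac{\partial \phi_{a_j}}{\partial t_i}(t_{j_1}, \cdots, t_{j_{r_j}}) = \begin{cases} 0, & \text{if } n_{ij} = 0, \\ \phi_{a_j - 1}(t_{j_1}, \cdots, t_{j_{r_j}}), & \text{if } n_{ij} = 1, \end{cases}$$

where one of the t's from the previous bracket is missing. Hence,

$$\mathcal{E}(m_i) = \left[ \prod_{j=1}^{c} \binom{r_j}{a_j} \right]^{-1} \sum_{j=1}^{c} n_{ij} \binom{r_j - 1}{a_j - 1} \prod_{k \ne j} \binom{r_k}{a_k} = \sum_{j=1}^{c} n_{ij} \frac{a_j}{r_j}. \tag{3}$$

Similarly,

$$\sigma_{ii} = \operatorname{var}(m_i) = \frac{\partial^2 \Phi}{\partial t_i^2}\bigg|_{t=1} + \mathcal{E}(m_i) - [\mathcal{E}(m_i)]^2.$$

From (2) we have

$$\frac{\partial^2 \Phi}{\partial t_i^2} = \left[ \prod_{j=1}^{c} \binom{r_j}{a_j} \right]^{-1} \sum_{j=1}^{c} \sum_{k \ne j} n_{ij} n_{ik}\, \phi_{a_j - 1}\, \phi_{a_k - 1} \prod_{l \ne j, k} \phi_{a_l}.$$

Hence,

$$\frac{\partial^2 \Phi}{\partial t_i^2}\bigg|_{t=1} = \sum_{j=1}^{c} \sum_{k \ne j} n_{ij} n_{ik} \frac{a_j a_k}{r_j r_k},$$

so that

$$\sigma_{ii} = \sum_{j=1}^{c} n_{ij} \frac{a_j (r_j - a_j)}{r_j^2}. \tag{4}$$

Similarly,

$$\sigma_{ii'} = \operatorname{cov}(m_i, m_{i'}) = \frac{\partial^2 \Phi}{\partial t_i\, \partial t_{i'}}\bigg|_{t=1} - \mathcal{E}(m_i)\mathcal{E}(m_{i'}).$$

From (2) we have

$$\frac{\partial^2 \Phi}{\partial t_i\, \partial t_{i'}} = \left[ \prod_{j=1}^{c} \binom{r_j}{a_j} \right]^{-1} \left[ \sum_{j=1}^{c} n_{ij} n_{i'j}\, \phi_{a_j - 2} \prod_{k \ne j} \phi_{a_k} + \sum_{j=1}^{c} \sum_{k \ne j} n_{ij} n_{i'k}\, \phi_{a_j - 1}\, \phi_{a_k - 1} \prod_{l \ne j, k} \phi_{a_l} \right].$$

Hence,

$$\frac{\partial^2 \Phi}{\partial t_i\, \partial t_{i'}}\bigg|_{t=1} = \sum_{j=1}^{c} n_{ij} n_{i'j} \frac{a_j (a_j - 1)}{r_j (r_j - 1)} + \sum_{j=1}^{c} \sum_{k \ne j} n_{ij} n_{i'k} \frac{a_j a_k}{r_j r_k},$$

so that

$$\sigma_{ii'} = -\sum_{j=1}^{c} n_{ij} n_{i'j} \frac{a_j (r_j - a_j)}{r_j^2 (r_j - 1)}. \tag{5}$$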
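Asymptotic normality. We have

$$\Phi(t) = \left[ \prod_{j=1}^{c} \binom{r_j}{a_j} \right]^{-1} \prod_{j=1}^{c} \phi_{a_j}(t_{j_1}, \cdots, t_{j_{r_j}}) = \sum_{m_1} \cdots \sum_{m_r} g(m_1, \cdots, m_r)\, t_1^{m_1} \cdots t_r^{m_r}.$$

Replacing $t_i$ in $\Phi$ by $\exp(s_i c_i^{-1/2})$, we have

$$\Phi'(s) = \sum_{m_1} \cdots \sum_{m_r} g(m_1, \cdots, m_r) \exp\Big( \sum_{i} m_i s_i c_i^{-1/2} \Big) = \text{moment generating function of the } m_i c_i^{-1/2}\text{'s}.$$

Let us consider $\log \Phi'$ for large c. We assume that $c_i/c$ tends to a positive limit as $c \to \infty$. We
have

$$\log \Phi' = \sum_{j=1}^{c} \log \left[ \phi_{a_j} \Big/ \binom{r_j}{a_j} \right], \tag{6}$$

where now

$$\phi_{a_j} = \sum \exp\Big( \sum_{i=1}^{a_j} s_{k_i} c_{k_i}^{-1/2} \Big),$$

where the summation is over the $\binom{r_j}{a_j}$ combinations of type $k_1, k_2, \cdots, k_{a_j}$ out of
$(j_1, \cdots, j_{r_j})$. Hence,

$$\binom{r_j}{a_j}^{-1} \phi_{a_j} = 1 + \sum_{i=1}^{r} n_{ij} \frac{a_j}{r_j} \frac{s_i}{c_i^{1/2}} + \frac{1}{2} \sum_{i=1}^{r} n_{ij} \frac{a_j}{r_j} \frac{s_i^2}{c_i} + \frac{1}{2} \sum_{i \ne i'} n_{ij} n_{i'j} \frac{a_j (a_j - 1)}{r_j (r_j - 1)} \frac{s_i s_{i'}}{(c_i c_{i'})^{1/2}} + O(c^{-3/2}),$$

so that

$$\log \left[ \binom{r_j}{a_j}^{-1} \phi_{a_j} \right] = \sum_{i=1}^{r} n_{ij} \frac{a_j}{r_j} \frac{s_i}{c_i^{1/2}} + \frac{1}{2} \sum_{i=1}^{r} n_{ij} \frac{a_j}{r_j} \frac{s_i^2}{c_i} + \frac{1}{2} \sum_{i \ne i'} n_{ij} n_{i'j} \frac{a_j (a_j - 1)}{r_j (r_j - 1)} \frac{s_i s_{i'}}{(c_i c_{i'})^{1/2}} - \frac{1}{2} \left[ \sum_{i=1}^{r} n_{ij} \frac{a_j}{r_j} \frac{s_i}{c_i^{1/2}} \right]^2 + O(c^{-3/2}).$$

Then, from (6) we have

$$\log \Phi' = \sum_{i=1}^{r} \mathcal{E}(m_i)\, s_i c_i^{-1/2} + \frac{1}{2} \sum_{i=1}^{r} \sigma_{ii}\, s_i^2 c_i^{-1} + \frac{1}{2} \sum_{i \ne i'} \sigma_{ii'}\, s_i s_{i'} (c_i c_{i'})^{-1/2} + O(c^{-1/2}).$$

Thus for large c, we have the distribution of the $m_i c_i^{-1/2}$'s approximated by the multivariate
normal distribution. Since $\sum_{i=1}^{r} m_i = \sum_{j=1}^{c} a_j$, it follows that the m's
are linearly dependent. Hence the above distribution is singular. Considering
only $m_1, m_2, \cdots, m_{r-1}$, which have an asymptotically nonsingular normal
distribution, we shall have a chi-square criterion with $r - 1$ d.f., given by

$$\chi^2 = \sum_{i=1}^{r-1} \sum_{i'=1}^{r-1} [m_i - \mathcal{E}(m_i)][m_{i'} - \mathcal{E}(m_{i'})]\, \sigma_{(r)}^{ii'}, \tag{7}$$

where $[\sigma_{(r)}^{ii'}] = \Sigma_{(r)}^{-1}$, $\Sigma_{(r)}$ being the cofactor of $\sigma_{rr}$ in $[\sigma_{ii'}]$.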
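
The moments (3), (4), (5) and the criterion (7) can be evaluated directly from the incidence pattern. The Python sketch below is a hedged illustration under stated assumptions: disallowed cells are marked `None`, every column is assumed to hold at least two observations without ties, and the helper names `solve` and `incomplete_median_test` are hypothetical. The matrix $\Sigma_{(r)}$ is taken as $[\sigma_{ii'}]$ with the rth row and column deleted.

```python
from statistics import median


def solve(A, b):
    """Solve A y = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda rr: abs(M[rr][col]))
        M[col], M[piv] = M[piv], M[col]
        for rr in range(n):
            if rr != col:
                f = M[rr][col] / M[col][col]
                M[rr] = [u - f * v for u, v in zip(M[rr], M[col])]
    return [M[k][n] / M[k][k] for k in range(n)]


def incomplete_median_test(x):
    """Criterion (7); x[i][j] is an observation or None for a disallowed cell."""
    r, c = len(x), len(x[0])
    n = [[0 if x[i][j] is None else 1 for j in range(c)] for i in range(r)]
    rj = [sum(n[i][j] for i in range(r)) for j in range(c)]  # assumed >= 2
    aj = [rj[j] // 2 for j in range(c)]
    m = [0] * r
    for j in range(c):
        med = median(x[i][j] for i in range(r) if n[i][j])
        for i in range(r):
            if n[i][j] and x[i][j] > med:
                m[i] += 1
    # Expectation (3)
    mean = [sum(n[i][j] * aj[j] / rj[j] for j in range(c)) for i in range(r)]

    def cov(i, k):
        if i == k:  # variance (4)
            return sum(n[i][j] * aj[j] * (rj[j] - aj[j]) / rj[j] ** 2
                       for j in range(c))
        # covariance (5)
        return -sum(n[i][j] * n[k][j] * aj[j] * (rj[j] - aj[j])
                    / (rj[j] ** 2 * (rj[j] - 1)) for j in range(c))

    z = [m[i] - mean[i] for i in range(r - 1)]
    S = [[cov(i, k) for k in range(r - 1)] for i in range(r - 1)]
    y = solve(S, z)  # y = S^{-1} z, so chi2 = z' S^{-1} z
    chi2 = sum(zi * yi for zi, yi in zip(z, y))
    return m, chi2, r - 1
```

For a complete layout the result agrees with (1), which gives a convenient check of the formulas.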

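Special case. Suppose $c_1 = c_2 = \cdots = c_r = c_0$, say, and $r_1 = r_2 = \cdots = r_c = r_0$,
say. Then $a_1 = a_2 = \cdots = a_c = a_0$, say, where $a_0 = \tfrac{1}{2}r_0$ if $r_0$ is even
and $\tfrac{1}{2}(r_0 - 1)$ otherwise. Also $rc_0 = cr_0$. Then from (3), (4) and (5), $\mathcal{E}(m_i) = a_0 c_0 / r_0$,
$\sigma_{ii} = c_0 a_0 (r_0 - a_0)/r_0^2$,

$$\sigma_{ii'} = -\frac{a_0 (r_0 - a_0)\, \lambda_{ii'}}{r_0^2 (r_0 - 1)}, \qquad i \ne i',$$

where $\lambda_{ii'} = \sum_{j=1}^{c} n_{ij} n_{i'j}$.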

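(i) Balanced incomplete block designs. Let $\lambda_{ii'} = \lambda$ for all $(ii')$, $(i \ne i')$. Then
we have $c_0(r_0 - 1) = \lambda(r - 1)$, $\sigma_{ii} = c_0 a_0 (r_0 - a_0)/r_0^2 = \alpha$, say, and
$\sigma_{ii'} = -a_0(r_0 - a_0)\lambda / r_0^2 (r_0 - 1) = \beta$, say. Let $I_r$ denote the unit matrix of order r
and $J_r$ denote the matrix $[1]_{r \times r}$. Thus, $\Sigma_{(r)} = (\alpha - \beta) I_{r-1} + \beta J_{r-1}$. Then

$$\Sigma_{(r)}^{-1} = \frac{1}{\alpha - \beta} I_{r-1} - \frac{\beta}{(\alpha - \beta)[\alpha + (r - 2)\beta]} J_{r-1} = \gamma [I_{r-1} + J_{r-1}],$$

where $\gamma = r_0^2 (r_0 - 1) / a_0 (r_0 - a_0) \lambda r$. Let

$$z_{(r-1) \times 1} = [\{m_i - (a_0 c_0 / r_0)\},\ i = 1, 2, \cdots, r - 1].$$

Then from (7), $\chi^2 = z' \Sigma_{(r)}^{-1} z = \gamma [z'z + z' J_{r-1} z]$. Now

$$z' J_{r-1} z = \Big( \sum_{i=1}^{r-1} z_i \Big)^2 = \Big[ \sum_{i=1}^{r-1} \{m_i - (a_0 c_0 / r_0)\} \Big]^2 = \{m_r - (c_0 a_0 / r_0)\}^2.$$

Hence

$$\chi^2 = \frac{r_0^2 (r_0 - 1)}{a_0 (r_0 - a_0) \lambda r} \sum_{i=1}^{r} \left( m_i - \frac{c_0 a_0}{r_0} \right)^2. \tag{8}$$

If we put $\lambda = c$ and hence $r_0 = r$, $c_0 = c$ and $a_0 = a$, we get back to (1).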
In the usual terminology of the BIBD, if the “rows” denote the “treatments”
and the “columns” denote the “blocks”, then
number of “treatments” = v,
number of “blocks” = b,
the number of replications of any “treatment” = r,
the number of “treatments” in any “block” = k,
the number of times any two “treatments” occur together in the same
“block” = $\lambda$. (8) then reduces to

$$\chi^2 = \frac{k^2 (k - 1)}{a (k - a) \lambda v} \sum_{i=1}^{v} \left( m_i - \frac{ra}{k} \right)^2,$$

where $a = \tfrac{1}{2}k$ if k is even and $\tfrac{1}{2}(k - 1)$ otherwise.
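
As a small numerical illustration of the BIBD form just given, the following Python sketch evaluates the statistic from the row counts $m_i$ and the design constants. The function name `bibd_median_chi2` is hypothetical, and the leading factor is taken as $k^2(k-1)/a(k-a)\lambda v$, the form that reduces to (1) for a complete design.

```python
def bibd_median_chi2(m, v, r, k, lam):
    """Median criterion for a BIBD with v treatments, r replications,
    block size k and pair concurrence lam; m[i] is the number of plus
    signs for treatment i. Returns (chi2, df)."""
    a = k // 2  # k/2 if k is even, (k - 1)/2 otherwise
    factor = k ** 2 * (k - 1) / (a * (k - a) * lam * v)
    chi2 = factor * sum((mi - r * a / k) ** 2 for mi in m)
    return chi2, v - 1
```

For example, with v = 3 treatments in b = 3 blocks of size k = 2 (r = 2, λ = 1) and counts m = (0, 1, 2), the factor is 4/3 and the sum of squares is 2.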


(ii) Partially balanced incomplete block designs. Let us consider rows as treatments,
so that $\lambda_{ii'} = \lambda_p$ if i and i' are pth associates. Then

$$\Sigma = \alpha I_r + \sum_{p=1}^{m} \beta_p B_p,$$

where m is the number of associate classes, $\alpha$ is defined as before, the B's are the
association matrices [4] and

$$\beta_p = -\frac{a_0 (r_0 - a_0)\, \lambda_p}{r_0^2 (r_0 - 1)}.$$

Using the results derived in [3] and simplifying, we have

$$\chi^2 = \frac{r_0 (r_0 - 1)}{a_0 (r_0 - a_0)} \sum_{i=1}^{r} \sum_{i'=1}^{r} c_{ii'} \left( m_i - \frac{c_0 a_0}{r_0} \right) \left( m_{i'} - \frac{c_0 a_0}{r_0} \right),$$

where $C = (c_{ii'})$ is such that the solution of the “normal equations” for t in
the analysis of variance for the PBIBD is given by $t = CQ$, Q being defined in
the usual notation [5].
In the usual terminology of the PBIBD, as indicated in the terminology for
the BIBD, we have

$$\chi^2 = \frac{k (k - 1)}{a (k - a)} \sum_{i=1}^{v} \sum_{i'=1}^{v} c_{ii'} \left( m_i - \frac{ra}{k} \right) \left( m_{i'} - \frac{ra}{k} \right). \tag{10}$$

For the PBIBD with two associate classes, the constants $c_1$ and $c_2$ (i.e., $c_{ii'}$'s)
are already given in [5].

3. Some regression problems. We shall first state a lemma [10] which will be
useful for later applications.


LEMMA. Let

$$g(m_1, m_2, \cdots, m_k) = \prod_{i=1}^{k} \binom{n_i}{m_i} \Big/ \binom{n}{m}, \tag{11}$$

where $n = \sum_{i=1}^{k} n_i$ and $m = \sum_{i=1}^{k} m_i$, denote the frequency function for the
m's. Then as $n \to \infty$ in such a way that $n_i/n \to p_i > 0$,

$$\chi^2 = \frac{n(n - 1)}{m(n - m)} \sum_{i=1}^{k} \frac{1}{n_i} \left( m_i - \frac{n_i m}{n} \right)^2$$

has the asymptotic $\chi^2$ distribution with $k - 1$ d.f.

Mood [10] says, “The expression (11) has a distribution very closely approximated
by the chi-square distribution with k − 1 d.f. even if n is only of the
order of twenty provided all the $n_i$ are at least five.”
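
Since the lemma is used repeatedly in what follows, a short Python sketch of its two ingredients may help: the exact frequency function (11) and the chi-square statistic. The function names `lemma_frequency` and `lemma_chi2` are illustrative assumptions, not notation from [10].

```python
from math import comb, prod


def lemma_frequency(n_sizes, m_counts):
    """Frequency function (11): product of C(n_i, m_i) divided by C(n, m)."""
    n, m = sum(n_sizes), sum(m_counts)
    return prod(comb(ni, mi) for ni, mi in zip(n_sizes, m_counts)) / comb(n, m)


def lemma_chi2(n_sizes, m_counts):
    """Chi-square statistic of the lemma; returns (chi2, df) with df = k - 1."""
    n, m = sum(n_sizes), sum(m_counts)
    factor = n * (n - 1) / (m * (n - m))
    chi2 = factor * sum((mi - ni * m / n) ** 2 / ni
                        for ni, mi in zip(n_sizes, m_counts))
    return chi2, len(n_sizes) - 1
```

For two groups of ten with counts (7, 3), the factor is 20·19/100 = 3.8 and the weighted sum of squares is 0.8, so the statistic is 3.04 on 1 d.f.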

3.1 One sample. Let $(x_1, y_1), \cdots, (x_n, y_n)$ denote a sample of n observations.
We shall assume that

(a) the distribution of y for any x is continuous and identical apart from a
shift or translation, and

(b) the regression is linear, that is, the location parameter (usually the
median), given x, is $\alpha + \beta x$, where $\alpha$ and $\beta$ are unknown parameters.

To estimate $\alpha$ and $\beta$, Mood and Brown [10] suggest that the estimates $\hat{\alpha}$ and
$\hat{\beta}$ should be determined by

$$\text{Median of } (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0 \quad \text{for } x_i < \tilde{x}, \tag{12}$$

$$\text{Median of } (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0 \quad \text{for } x_i > \tilde{x}, \tag{13}$$

where $\tilde{x}$ is the median of the x's. If it happens that several x values fall at $\tilde{x}$, then
the sign < in (12) and the sign > in (13) may be replaced by $\le$ and $\ge$ if such a
replacement would more nearly divide the points into groups of equal size. They
also give an iteration procedure to determine $\hat{\alpha}$ and $\hat{\beta}$.

We shall find it convenient to speak of $x_i \le \tilde{x}$ as the group I and of $x_i > \tilde{x}$
as the group II. Then (12) and (13) may be equivalently written as

$$\hat{\alpha} = \text{Median}\,(y_i - \hat{\beta} x_i) \tag{14}$$

and

$$\text{Median}_{\mathrm{I}}(y_i - \hat{\beta} x_i) = \text{Median}_{\mathrm{II}}(y_i - \hat{\beta} x_i). \tag{15}$$
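
Equations (14) and (15) can be solved numerically in several ways. The sketch below is a minimal Python illustration that uses plain bisection on the slope; this is a technique swapped in here for brevity and is not the iteration procedure of Mood and Brown [10]. It relies on the observation that the difference of the two group medians in (15) is a continuous, nondecreasing function of the trial slope, and it brackets the root by the smallest and largest slopes joining a group I point to a group II point. The function name `median_regression` is hypothetical.

```python
from statistics import median


def median_regression(x, y, iterations=60):
    """Estimate (alpha, beta) from (14) and (15) by bisection on beta.

    Group I is x <= median(x), group II is x > median(x); both groups
    are assumed nonempty.
    """
    x_med = median(x)
    g1 = [(xi, yi) for xi, yi in zip(x, y) if xi <= x_med]
    g2 = [(xi, yi) for xi, yi in zip(x, y) if xi > x_med]

    def gap(beta):
        # Median_I(y - beta x) - Median_II(y - beta x); zero when (15) holds.
        return (median(yi - beta * xi for xi, yi in g1)
                - median(yi - beta * xi for xi, yi in g2))

    # Every slope joining a group I point to a group II point; the gap is
    # <= 0 at the smallest such slope and >= 0 at the largest.
    slopes = [(yj - yi) / (xj - xi) for xi, yi in g1 for xj, yj in g2]
    lo, hi = min(slopes), max(slopes)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    beta = (lo + hi) / 2.0
    alpha = median(yi - beta * xi for xi, yi in zip(x, y))  # equation (14)
    return alpha, beta
```

On data lying exactly on a line the sketch recovers that line; on noisy data it returns a slope at which the two group medians agree to within the bisection tolerance.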


Test for $\beta = \beta_0$. If $\beta = \beta_0$, $\alpha$ is estimated by $\hat{\alpha} = \text{Median}\,(y_i - \beta_0 x_i)$. Mood
and Brown consider the number of points, say $m_1$ and $m_2$, above the line
$y = \hat{\alpha} + \beta_0 x$ in each group. Let us, for convenience, assume that n is even. Then the
frequency function of $m_1$ and $m_2$ is given by

$$p(m_1, m_2) = \binom{\tfrac{1}{2}n}{m_1} \binom{\tfrac{1}{2}n}{m_2} \Big/ \binom{n}{\tfrac{1}{2}n}, \tag{16}$$

so that, by the lemma, they obtain

$$\chi^2 = \frac{16}{n} \left( m_1 - \frac{n}{4} \right)^2, \qquad \text{d.f.} = 1, \tag{17}$$

as the test-statistic. It may be seen that the supposition that n be even may be
relaxed.
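
A minimal Python sketch of the statistic (17) follows. The function name `slope_test` is an assumption made for illustration, and residuals exactly equal to zero are counted as not above the line, which is immaterial under the continuity assumption (a).

```python
from statistics import median


def slope_test(x, y, beta0):
    """Statistic (17) for the hypothesis beta = beta0.

    Returns (m1, chi2): m1 is the number of group I points (x <= median x)
    above the line y = alpha_hat + beta0 * x; chi2 has 1 d.f.
    """
    n = len(x)
    x_med = median(x)
    alpha_hat = median(yi - beta0 * xi for xi, yi in zip(x, y))
    m1 = sum(1 for xi, yi in zip(x, y)
             if xi <= x_med and yi - alpha_hat - beta0 * xi > 0)
    chi2 = 16.0 / n * (m1 - n / 4) ** 2
    return m1, chi2
```

With eight points on the line y = x and $\beta_0 = 0$, no group I point lies above the fitted horizontal line, so $m_1 = 0$ and the statistic is $(16/8)(0 - 2)^2 = 8$.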

We may arrive at (17) by some heuristic considerations. Assuming n is even,
as before, we have n/2 points in each group and we note that $m_1 + m_2 = n/2$.
If the hypothesis is true, we expect $m_1$ and $m_2$ to be approximately n/4. Now
$m_1$ is the number of positive $(y_i - \hat{\alpha} - \beta_0 x_i)$'s from the first group and, similarly,
for $m_2$. Now the $(y_i - \alpha - \beta_0 x_i)$'s have identical distributions and, also,
$\hat{\alpha} - \alpha \xrightarrow{p} 0$ as $n \to \infty$, so that, on heuristic considerations,

$$p(m_1, m_2) \sim \binom{\tfrac{1}{2}n}{m_1} \binom{\tfrac{1}{2}n}{m_2} \Big/ \binom{n}{\tfrac{1}{2}n}$$

for large n and, by the lemma, we again have the asymptotic $\chi^2$ statistic given
by (17).

If we are willing to assume, in addition to (a) and (b), that

(c) the mean and variance of y exist for any x,

then taking the mean as a location parameter given by $\alpha + \beta x$, $\alpha$ and $\beta$ can be
immediately estimated by the usual least squares estimators. In the above case,
$\hat{\alpha} = \bar{y} - \beta_0 \bar{x}$, where $\bar{y}$ is the mean of the y's and similarly for $\bar{x}$. Then also
$\hat{\alpha} - \alpha \xrightarrow{p} 0$. In this case, if b denotes the number of points above the
regression line, we have by a similar heuristic argument

$$p(m_1, m_2) \sim \binom{\tfrac{1}{2}n}{m_1} \binom{\tfrac{1}{2}n}{m_2} \Big/ \binom{n}{b}$$

for large n and where $m_1$ and $m_2$ are defined as before. Hence by the lemma we
have an alternate test-statistic

$$\chi^2 = \{4n/[b(n - b)]\}\, (m_1 - \tfrac{1}{2}b)^2, \qquad \text{d.f.} = 1. \tag{18}$$


Consistency of $\hat{\alpha}$ and $\hat{\beta}$ determined by (14) and (15). Let $z_i = y_i - \alpha - \beta x_i$.
Then the z's have identical distributions with median zero. Now (15) may be
written as

$$\text{Median}_{\mathrm{I}}[z_i + (\beta - \hat{\beta}) x_i] = \text{Median}_{\mathrm{II}}[z_i + (\beta - \hat{\beta}) x_i]. \tag{19}$$

Now as $n \to \infty$, $|\text{Median}_{\mathrm{I}}(z_i) - \text{Median}_{\mathrm{II}}(z_i)| \xrightarrow{p} 0$, so that intuitively it
seems that $\hat{\beta} \approx \beta$ will satisfy (19), that is, $|\hat{\beta} - \beta| \xrightarrow{p} 0$. It has not been possible
yet to give a general proof. We shall, however, give a proof for the case where
there is a unit of measurement for x. This should cover most of the practical
cases.

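Proof for the special case. Let $\tilde{x}_n$, $\theta_{1n}$, $\theta_{2n}$ and $\hat{\beta}_n$ denote the median of the x's,
$\text{Median}_{\mathrm{I}}(z_i)$, $\text{Median}_{\mathrm{II}}(z_i)$ and $\hat{\beta}$ respectively when the sample size is n. Let us
suppose that (i) $x_i \le \tilde{x}_n$ form the group I and $x_i > \tilde{x}_n$ form the group II, and
(ii) $x > x_0$ implies $x \ge x_0 + \delta$, where $\delta$ is a fixed positive number, however small.
[For example, $\delta$ may be in the nature of a unit of measurement.]

Since $\theta_{1n} \xrightarrow{p} 0$ and $\theta_{2n} \xrightarrow{p} 0$, given $\eta, \epsilon > 0$, there is an $n_1$ such that

$$|\theta_{2n}| < \epsilon \quad \text{and} \quad |\theta_{1n}| < \epsilon \quad \text{for } n > n_1, \tag{20}$$

with probability greater than $1 - \eta$. Consider n greater than $n_1$. Let $\beta - \hat{\beta}_n = \theta_n$.

Case (1). Suppose $\theta_n \ge 0$. Then

$$\text{Median}_{\mathrm{I}}[z_i + (\beta - \hat{\beta}_n) x_i] \le \theta_{1n} + \theta_n \tilde{x}_n,$$

$$\text{Median}_{\mathrm{II}}[z_i + (\beta - \hat{\beta}_n) x_i] \ge \theta_{2n} + \theta_n (\tilde{x}_n + \delta).$$

Then (19) implies that $\theta_{2n} + \theta_n(\tilde{x}_n + \delta) \le \theta_{1n} + \theta_n \tilde{x}_n$, so that
$\theta_n \delta \le \theta_{1n} - \theta_{2n} < 2\epsilon$, from (20). Hence $\theta_n = |\theta_n| < 2\epsilon/\delta = \epsilon'$, say.

Case (2). Suppose $\theta_n < 0$. Then

$$\text{Median}_{\mathrm{I}}[z_i + (\beta - \hat{\beta}_n) x_i] \ge \theta_{1n} - |\theta_n| \tilde{x}_n,$$

$$\text{Median}_{\mathrm{II}}[z_i + (\beta - \hat{\beta}_n) x_i] \le \theta_{2n} - |\theta_n| (\tilde{x}_n + \delta).$$

Again, (19) implies that $\theta_{1n} - |\theta_n| \tilde{x}_n \le \theta_{2n} - |\theta_n|(\tilde{x}_n + \delta)$, so that
$|\theta_n| \delta \le \theta_{2n} - \theta_{1n} < 2\epsilon$, from (20). Hence, again, $|\theta_n| < 2\epsilon/\delta = \epsilon'$. Thus, given $\eta$ and
$\epsilon' > 0$, there is $n_1$ such that $|\theta_n| < \epsilon'$ with probability $> 1 - \eta$ for $n \ge n_1$.

Thus $\theta_n \xrightarrow{p} 0$, that is, $\hat{\beta}_n \xrightarrow{p} \beta$, and the proof is complete for the special case
mentioned above.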

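Consistency of $\hat{\alpha}$. Let us assume that $\hat{\beta} \xrightarrow{p} \beta$. Now $\hat{\alpha} = \text{Median}\,(y_i - \hat{\beta} x_i) =
\alpha + \text{Median}\,[z_i + (\beta - \hat{\beta}) x_i]$. We shall assume that the x's are bounded. Suppose
$|x_i| < M$ for all i. Also $\hat{\beta} \xrightarrow{p} \beta$ implies that given $\epsilon, \eta > 0$, there is an $n^*$
such that $|\hat{\beta} - \beta| < \epsilon/M$ for all $n \ge n^*$, with probability $> 1 - \eta$. Then

$\text{Median}\,(z_i) - \epsilon \le \text{Median}\,[z_i + (\beta - \hat{\beta}) x_i] \le \text{Median}\,(z_i) + \epsilon$, for $n \ge n^*$
with probability $> 1 - \eta$. Also $\text{Median}\,(z_i) \xrightarrow{p} 0$, so that
$\text{Median}\,[z_i + (\beta - \hat{\beta}) x_i] \xrightarrow{p} 0$. Thus, $\hat{\alpha} \xrightarrow{p} \alpha$.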

REMARKS. We may decide to take $x \le x_0$ as the group I and $x > x_0$ as the
group II (even though $x_0$ is not the median of the x's) in the equation (15) to
estimate $\beta$, where $x_0$ is chosen suitably (preferably so as to divide the points
into groups of approximately equal size). Then the above proof, with slight
modifications, will go through if we assume, instead of (i) and (ii), that all the
x's in the group II are greater than or equal to $x_0 + \delta$, where $\delta$ is a fixed positive
number, however small. This would cover almost all the practical problems, $\delta$
being in the nature of a unit of measurement.

Then the test-statistic (17) can be modified suitably. Let a and $n - a$ be
the number of points in the groups I and II respectively. If we define $m_1$ and $m_2$
as before, then $m_1 + m_2 = n/2$ (assuming n to be even). Then, on similar heuristic
considerations,

$$p(m_1, m_2) \sim \binom{a}{m_1} \binom{n - a}{m_2} \Big/ \binom{n}{n/2},$$

so that by the lemma we have the asymptotic $\chi^2$ statistic

$$\chi^2 = \{4n/[a(n - a)]\}\, (m_1 - \tfrac{1}{2}a)^2, \qquad \text{d.f.} = 1. \tag{21}$$

The supposition that n be even may now be relaxed.

3.2 c samples. Let us suppose that we have $n_i$ independent observations
$(x_{ij}, y_{ij})$, $j = 1, 2, \cdots, n_i$, from the ith population, $i = 1, 2, \cdots, c$. We shall
assume (a) as before and (b) that the regression is linear, that is, the location
parameter (usually the median) of $y_{ij}$, given $x_{ij}$, is $\alpha_i + \beta_i x_{ij}$.

(i) To test $\beta_i = \beta_{i0}$, $i = 1, 2, \cdots, c$.

We shall have c independent $\chi^2$ statistics with 1 d.f. each, giving the $\chi^2$ statistic
with c d.f. No new problem is presented here.

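(ii) To test $\beta_1 = \beta_2 = \cdots = \beta_c$. On this hypothesis, $y_{ij}$'s have medians
$\alpha_i + \beta x_{ij}$. We may estimate the $\alpha$'s and $\beta$ by

$$\hat{\alpha}_i = \underset{j = 1, 2, \cdots, n_i}{\text{Median}}\,(y_{ij} - \hat{\beta} x_{ij}),$$

$$\text{Median}_{\mathrm{I}}(y_{ij} - \hat{\alpha}_i - \hat{\beta} x_{ij}) = \text{Median}_{\mathrm{II}}(y_{ij} - \hat{\alpha}_i - \hat{\beta} x_{ij}).$$

For convenience, we shall take group I as $x \le \tilde{x}$ (the median of all the x's) and
group II as $x > \tilde{x}$, though the test-statistic can be modified to suit other cases.

Let $\sum_{1}^{c} n_i = N$. For simplicity, let us take $n_i$ to be even. Let $m_i$ be the number
of points from the ith sample belonging to the second group and $l_i$ be the number
of points out of these $m_i$ that lie above $y = \hat{\alpha}_i + \hat{\beta} x$. Then $\sum_{1}^{c} m_i = \tfrac{1}{2}N$ and
$\sum_{1}^{c} l_i \approx \tfrac{1}{4}N$. If the hypothesis is true, we expect $l_i$ to be $\tfrac{1}{2}m_i$. Let $l_i'$ be the number
of observations from the $m_i$ in the second group of the ith sample, such that
$z_{ij} = y_{ij} - \alpha_i - \beta x_{ij}$ is > 0. Since $\hat{\alpha}_i - \alpha_i \xrightarrow{p} 0$ and $\hat{\beta} - \beta \xrightarrow{p} 0$, $l_i - l_i' \xrightarrow{p} 0$
as n's $\to \infty$. Therefore, heuristically, the l's have the same distribution for
large n's as the l''s subject to $\sum_{1}^{c} l_i' \approx \tfrac{1}{4}N$. Since the $z_{ij}$'s have identical
distributions,

$$p(l_1', l_2', \cdots, l_c') = \prod_{i=1}^{c} \binom{m_i}{l_i'} \left( \tfrac{1}{2} \right)^{m_i},$$

so that

$$p(l_1, l_2, \cdots, l_c) \sim \prod_{i=1}^{c} \binom{m_i}{l_i} \Big/ \binom{\tfrac{1}{2}N}{\tfrac{1}{4}N}.$$

Hence, by the lemma we have

$$\chi^2 = 4 \sum_{i=1}^{c} m_i^{-1} (l_i - \tfrac{1}{2} m_i)^2, \qquad \text{d.f.} = c - 1.$$

If some $m_i = 0$, the corresponding term will be absent and d.f. will be reduced
by one. Of course, as the referee has pointed out, if some $m_i$ are small the
approximation would be questionable. We could have considered the group I
instead of the group II. It may be seen now that the condition $n_i$ even may be
relaxed.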

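If we are willing to assume, in addition, (c) as before, then we may take the
least squares estimates

$$\hat{\beta} = \frac{\sum_i \sum_j (y_{ij} - \bar{y}_i)\, x_{ij}}{\sum_i \sum_j (x_{ij} - \bar{x}_i)^2}, \qquad \hat{\alpha}_i = \bar{y}_i - \hat{\beta} \bar{x}_i,$$

so that $\hat{\alpha}_i \xrightarrow{p} \alpha_i$ and $\hat{\beta} \xrightarrow{p} \beta$. If $l_i$ denotes the number of points from the ith
sample above the corresponding regression line and $\sum_i l_i = l$, then by a similar
heuristic argument,

$$p(l_1, \cdots, l_c) \sim \prod_{i=1}^{c} \binom{n_i}{l_i} \Big/ \binom{N}{l}$$

for large n's so that by the lemma,

$$\chi^2 = \frac{N^2}{l(N - l)} \sum_{i=1}^{c} \frac{1}{n_i} \left( l_i - \frac{n_i l}{N} \right)^2, \qquad \text{d.f.} = c - 1.$$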


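(iii) To test $\alpha_1 = \alpha_2 = \cdots = \alpha_c$ when $\beta_1 = \beta_2 = \cdots = \beta_c$. On this
hypothesis, $y_{ij}$'s have medians $\alpha + \beta x_{ij}$. $\alpha$ and $\beta$ may be estimated by

$$\hat{\alpha} = \text{Median}\,(y_{ij} - \hat{\beta} x_{ij}),$$

$$\text{Median}_{\mathrm{I}}[y_{ij} - \hat{\beta} x_{ij}] = \text{Median}_{\mathrm{II}}[y_{ij} - \hat{\beta} x_{ij}],$$

where, for convenience, we take groups I and II as $x \le \tilde{x}$ (the median of all
the x's) and $x > \tilde{x}$ respectively. Let N be even and $l_i$ be the number of points
in the ith sample above the regression line $y = \hat{\alpha} + \hat{\beta} x$. If the hypothesis is
true, we expect $l_i \approx \tfrac{1}{2} n_i$. We note that $\sum_{1}^{c} l_i = \tfrac{1}{2}N$. Let $l_i'$ denote the number
of positive terms in $y_{ij} - \alpha - \beta x_{ij}$ $(j = 1, 2, \cdots, n_i)$. Since $\hat{\alpha} \xrightarrow{p} \alpha$ and $\hat{\beta} \xrightarrow{p} \beta$,
$l_i - l_i' \xrightarrow{p} 0$ as $N \to \infty$. Hence by similar heuristic arguments, the distribution
of the l's for large N is approximately the same as that of the l''s subject to
$\sum_{1}^{c} l_i' = \tfrac{1}{2}N$. Hence,

$$p(l_1, l_2, \cdots, l_c) \sim \prod_{i=1}^{c} \binom{n_i}{l_i} \Big/ \binom{N}{\tfrac{1}{2}N} \tag{22}$$

for large N so that by the lemma,

$$\chi^2 = 4 \sum_{i=1}^{c} \frac{1}{n_i} \left( l_i - \tfrac{1}{2} n_i \right)^2, \qquad \text{d.f.} = c - 1. \tag{23}$$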
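
The statistic (23) counts, sample by sample, the points above a common fitted line. The Python sketch below is a hedged illustration: the function name `intercept_test` is hypothetical, and the common slope estimate `beta_hat` is assumed to be supplied by the caller (for instance from the two-group median equation above), so that only $\hat{\alpha}$ is computed here as the pooled median of the residuals.

```python
from statistics import median


def intercept_test(samples, beta_hat):
    """Statistic (23) for equality of the alpha's given a common slope.

    samples: list of c samples, each a list of (x, y) pairs.
    Returns (l, chi2, df), where l[i] is the number of points of sample i
    above the line y = alpha_hat + beta_hat * x.
    """
    residuals = [[y - beta_hat * x for x, y in s] for s in samples]
    alpha_hat = median(u for res in residuals for u in res)  # pooled median
    l = [sum(1 for u in res if u > alpha_hat) for res in residuals]
    chi2 = 4.0 * sum((li - len(res) / 2) ** 2 / len(res)
                     for li, res in zip(l, residuals))
    return l, chi2, len(samples) - 1
```

With two samples of four points whose residuals are entirely below and entirely above the pooled median, the counts are (0, 4) and the statistic is 4(1 + 1) = 8 on 1 d.f.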


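If we are willing to assume, in addition, (c), that is, the existence of the mean
and variance, then we can have the least-square estimates $\hat{\alpha}$ and $\hat{\beta}$, such that
$\hat{\alpha} \xrightarrow{p} \alpha$ and $\hat{\beta} \xrightarrow{p} \beta$. If we denote $\sum_{1}^{c} l_i$ by d, then by the same heuristic
argument

$$p(l_1, l_2, \cdots, l_c) \sim \prod_{i=1}^{c} \binom{n_i}{l_i} \Big/ \binom{N}{d}$$

for large N so that

$$\chi^2 = \frac{N^2}{d(N - d)} \sum_{i=1}^{c} \frac{1}{n_i} \left( l_i - \frac{n_i d}{N} \right)^2, \qquad \text{d.f.} = c - 1.$$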


We shall indicate here briefly a formal proof for (22), which was first derived
on heuristic considerations.
Let $u_{ij} = y_{ij} - \hat{\beta} x_{ij}$. Then

$l_i$ = number of positive $y_{ij} - \hat{\alpha} - \hat{\beta} x_{ij}$ $(j = 1, 2, \cdots, n_i)$

= number of $u_{ij}$'s $(j = 1, 2, \cdots, n_i)$ $> \hat{\alpha} = \text{Median}_{i,j}\,(u_{ij})$.

Also $\sum_{1}^{c} l_i = \tfrac{1}{2}N$. Let $z_a$ be the ath $(a = \tfrac{1}{2}N)$ $u_{ij}$ in magnitude. Then the joint
frequency function of $l_1, \cdots, l_c$ and $z_a$, under the hypothesis, is

$$\sum_{i=1}^{c} {\sum}^{*}\, F_{11}(z_a) \cdots F_{1, n_1 - l_1}(z_a)\, [1 - F_{1, n_1 - l_1 + 1}(z_a)] \cdots [1 - F_{1 n_1}(z_a)] \cdots F_{i1}(z_a) \cdots F_{i, n_i - l_i - 1}(z_a)\, [1 - F_{i, n_i - l_i + 1}(z_a)] \cdots [1 - F_{i n_i}(z_a)]\, dF_{i, n_i - l_i}(z_a) \cdots F_{c1}(z_a) \cdots F_{c, n_c - l_c}(z_a)\, [1 - F_{c, n_c - l_c + 1}(z_a)] \cdots [1 - F_{c n_c}(z_a)], \tag{24}$$

where $F_{ij}(z_a) = \Pr[u_{ij} \le z_a]$, the ith term indicates that $z_a$ is from the ith
sample and $\sum^{*}$ denotes the sum over all possible combinations.
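Since $\hat{\beta} \xrightarrow{p} \beta$, given $\epsilon, \eta > 0$, there is an $N_0$ such that, for $N \ge N_0$, $|\hat{\beta} - \beta| < \epsilon$
with probability $> 1 - \eta$. Then for $N \ge N_0$, with probability $> 1 - \eta$, we
have $\Pr[y_{ij} - \beta x_{ij} \le z_a - \epsilon x_{ij}] \le F_{ij}(z_a) \le \Pr[y_{ij} - \beta x_{ij} \le z_a + \epsilon x_{ij}]$, that is,

$$F(z_a - \epsilon x_{ij}) \le F_{ij}(z_a) \le F(z_a + \epsilon x_{ij}),$$

where F denotes the distribution function of all $y_{ij} - \beta x_{ij}$. In view of the
continuity of F,

$$F_{ij}(z_a) = F(z_a) + \delta_{ij},$$

where the $\delta$'s are arbitrarily small and tend to zero as $n \to \infty$. Then (24) becomes

$$\sum_{i=1}^{c} \left[ \prod_{k \ne i} \binom{n_k}{l_k} F^{n_k - l_k}(z_a) [1 - F(z_a)]^{l_k} \right] n_i \binom{n_i - 1}{l_i} F^{n_i - l_i - 1}(z_a) [1 - F(z_a)]^{l_i}\, dF(z_a) + O(\delta)$$

$$= \left[ \prod_{k=1}^{c} \binom{n_k}{l_k} \right] \sum_{i=1}^{c} (n_i - l_i)\, F^{\frac{1}{2}N - 1}(z_a) [1 - F(z_a)]^{\frac{1}{2}N}\, dF(z_a) + O(\delta).$$

On integrating out $z_a$, we have the joint frequency function of $l_1, \cdots, l_c$:

$$\left[ \prod_{i=1}^{c} \binom{n_i}{l_i} \right] B(\tfrac{1}{2}N, \tfrac{1}{2}N + 1) \sum_{i=1}^{c} (n_i - l_i) + O(\delta) = \prod_{i=1}^{c} \binom{n_i}{l_i} \Big/ \binom{N}{\tfrac{1}{2}N} + O(\delta),$$

which is the same as (22).

(iv) To test $\beta = 0$, when $\beta_1 = \beta_2 = \cdots = \beta_c = \beta$, say. On this hypothesis,
$y_{ij}$'s have medians $\alpha_i$'s. We may take

$$\hat{\alpha}_i = \underset{j = 1, 2, \cdots, n_i}{\text{Median}}\,(y_{ij}).$$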

For simplicity let $n_i$ be even. Then $\tfrac{1}{2} n_i$ points from the ith sample are above the
corresponding line. Also $\tfrac{1}{2}N$ points are to the right of $\tilde{x}$, the median of all the
x's. Let $l_i$ be the number of points from the ith sample to the right of $\tilde{x}$ and
above the corresponding line and let $l = \sum_{1}^{c} l_i$. We expect, then, l to be $\tfrac{1}{4}N$.
Let $m_i$ and m be defined similarly for $x < \tilde{x}$. Then, by the same heuristic
argument, for which a formal proof could be given as in (iii), we have

$$p(l, m) \sim \binom{\tfrac{1}{2}N}{l} \binom{\tfrac{1}{2}N}{m} \Big/ \binom{N}{\tfrac{1}{2}N}$$

for large N and, hence, by the lemma we have

$$\chi^2 = (16/N)(l - \tfrac{1}{4}N)^2, \qquad \text{d.f.} = 1.$$

The condition that $n_i$ be even, then, may be relaxed.

3.3 Testing linearity of regression. As in the normal analysis, it is necessary
that we have a number of observations for each $x_i$. Let the observations be
$(x_i, y_{ij})$, $j = 1, 2, \cdots, n_i$, $i = 1, 2, \cdots, k$. We shall assume that the distribution
of y, given x, is continuous and the same apart from location, say h(x),
which may depend on x. We want to test the hypothesis that the “regression”
is linear, that is, $h(x) = \alpha + \beta x$. Let $\sum_{1}^{k} n_i = N$ and these N observations be
divided into two groups, say $x \le x_{k_1}$ forming the first group and $x > x_{k_1}$ forming
the second group, as evenly as possible. Let us suppose that observations
corresponding to $x_i$ $(i = 1, 2, \cdots, k_1)$ belong to the first group and the rest to the
second. Let the groups contain a and $N - a$ observations respectively. We
may then estimate $\alpha$ and $\beta$ by $\text{Median}\,(y_{ij} - \hat{\alpha} - \hat{\beta} x_i) = 0$, and

$$\text{Median}_{\mathrm{I}}(y_{ij} - \hat{\beta} x_i) = \text{Median}_{\mathrm{II}}(y_{ij} - \hat{\beta} x_i).$$

Consider the $n_i$ observations corresponding to $x_i$. If the regression is linear,
we expect these $n_i$ to be split evenly by the regression line $y = \hat{\alpha} + \hat{\beta} x$. Let $l_i$,
out of these $n_i$, be above the line. We expect $l_i \approx \tfrac{1}{2} n_i$. Then $\sum_{i=1}^{k_1} l_i = \tfrac{1}{2}a$
and $\sum_{i=k_1+1}^{k} l_i = \tfrac{1}{2}(N - a)$, assuming for convenience that a and $N - a$ are
even.
Let $z_{ij} = y_{ij} - \alpha - \beta x_i$. Then on the null hypothesis, the $z_{ij}$'s have identical
distributions. Let $l_i'$ be the number of positive terms in $z_{ij}$ $(j = 1, 2, \cdots, n_i)$.
Since $\hat{\alpha} \xrightarrow{p} \alpha$ and $\hat{\beta} \xrightarrow{p} \beta$, $l_i - l_i' \xrightarrow{p} 0$. Hence on heuristic considerations as
before, the distribution of the $l_i$'s is the same (asymptotically) as that of the
$l_i'$'s subject to $\sum_{i=1}^{k_1} l_i' = \tfrac{1}{2}a$ and $\sum_{i=k_1+1}^{k} l_i' = \tfrac{1}{2}(N - a)$. Thus,

$$p(l_1, l_2, \cdots, l_k) \sim \left[ \prod_{i=1}^{k_1} \binom{n_i}{l_i} \Big/ \binom{a}{\tfrac{1}{2}a} \right] \left[ \prod_{i=k_1+1}^{k} \binom{n_i}{l_i} \Big/ \binom{N - a}{\tfrac{1}{2}(N - a)} \right],$$

so that by the lemma,

$$\chi_1^2 = 4 \sum_{i=1}^{k_1} \frac{1}{n_i} (l_i - \tfrac{1}{2} n_i)^2, \qquad \chi_2^2 = 4 \sum_{i=k_1+1}^{k} \frac{1}{n_i} (l_i - \tfrac{1}{2} n_i)^2,$$

whence

$$\chi^2 = 4 \sum_{i=1}^{k} \frac{1}{n_i} (l_i - \tfrac{1}{2} n_i)^2, \qquad \text{d.f.} = k - 2.$$


4. Some bivariate problems.
4.1 One-way classification. Let there be $n_i$ independent observations $(x_{ij}, y_{ij})$,
$j = 1, 2, \cdots, n_i$, from the ith population, $i = 1, 2, \cdots, k$, and let $\sum_{1}^{k} n_i = N$.
Suppose $F_i(x, y)$ denotes the distribution function of (X, Y) for the ith
population. We shall assume that

(i) the F's are absolutely continuous,

(ii) the distributions are identical except for location, and

(iii) the median of the conditional distribution of Y, given X, is a linear
function of X.

We note that the conditional probability, given X, is also a probability
measure almost everywhere. Let $f_i(x, y)$ and $f_i(x)$ denote the densities
of (X, Y) and X respectively for the ith population. Also, in view of (ii),

$$F_i(x, y) = F(x - \xi_i, y - \eta_i). \tag{25}$$

We want to test whether the populations are identical. Thus

$$H_0: \xi_1 = \xi_2 = \cdots = \xi_k; \quad \eta_1 = \eta_2 = \cdots = \eta_k.$$

(25) implies $f_i(x, y) = f(x - \xi_i, y - \eta_i)$ so that $f_i(x) = g(x - \xi_i)$, say. (iii)
implies that $f(x, y) = g(x) h(y - \alpha - \beta x)$, so that

$$f(x - \xi_i, y - \eta_i) = g(x - \xi_i)\, h[y - \eta_i - \alpha - \beta(x - \xi_i)] = g(x - \xi_i)\, h(y - \alpha_i - \beta x),$$

say. Thus we see that

$$H_0: \xi_1 = \xi_2 = \cdots = \xi_k; \quad \alpha_1 = \alpha_2 = \cdots = \alpha_k.$$

It may be noted that we have relaxed just the normality of the distribution, but
retained other features from the classical set up.

We shall use a step-down procedure to test $H_0$. A step-down procedure for
$H_0$ with a level $\gamma$ will be a test for

$$H_{0x}: \xi_1 = \xi_2 = \cdots = \xi_k$$

with a level $\gamma_1$, and if $H_{0x}$ is not rejected, a further test for

$$H_{0y|x}: \alpha_1 = \alpha_2 = \cdots = \alpha_k$$

with a level $\gamma_2$, where $\gamma_1$ and $\gamma_2$ are chosen suitably so that

$$(1 - \gamma) = (1 - \gamma_1)(1 - \gamma_2).$$

The test for $H_{0y|x}$ will be derived from the conditional distribution of Y, given
x, so that the x's then can be regarded as fixed.
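
The level relation above leaves the choice of the two component levels open. The Python sketch below illustrates one convenient choice, an equal split, and the resulting two-stage decision. The equal split, the use of p-values for the two component tests, and the names `split_level` and `step_down` are all assumptions made here for illustration; the text only requires that the levels be chosen suitably.

```python
import math


def split_level(gamma):
    """Equal split: gamma1 = gamma2 with (1 - gamma) = (1 - gamma1)(1 - gamma2)."""
    g = 1.0 - math.sqrt(1.0 - gamma)
    return g, g


def step_down(p_x, p_y_given_x, gamma):
    """Two-stage decision for H0 at overall level gamma.

    p_x: p-value of the test on the x's alone.
    p_y_given_x: p-value of the conditional test of y given x.
    """
    g1, g2 = split_level(gamma)
    if p_x < g1:
        return "reject at step 1"
    if p_y_given_x < g2:
        return "reject at step 2"
    return "do not reject"
```

For an overall level of 0.0975 the equal split gives component levels of 0.05 each, since 0.95 × 0.95 = 0.9025.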

For $H_{0x}$, we consider only the x's. Let us consider the test given by Mood
[10]. (We could have used either Kruskal's test or the test proposed in [2].)
Let $m_i$ denote the number of observations in the ith sample greater than the
median of all the x's. Mood shows that the frequency function, if $H_{0x}$ is true, is

$$p(m_1, m_2, \cdots, m_k) = \prod_{i=1}^{k} \binom{n_i}{m_i} \Big/ \binom{N}{a}, \tag{26}$$

where $a = \tfrac{1}{2}N$ if N is even or $\tfrac{1}{2}(N - 1)$ if N is odd. The test-statistic proposed
by him for large N is

$$\chi^2 = \frac{N(N - 1)}{a(N - a)} \sum_{i=1}^{k} \frac{1}{n_i} \left( m_i - \frac{n_i a}{N} \right)^2, \qquad \text{d.f.} = k - 1.$$

For small n's, the probability is computed from the exact distribution (26).
The test for $H_{0y|x}$ is seen to be precisely the same as that considered in 3.2.
Hence we may take (23) (in its modified form) as a test-statistic, if the
condition mentioned in the remark holds good. As already stated, it may be possible
to prove that $\hat{\alpha} \xrightarrow{p} \alpha$ without using the condition, in which case (23) may be
used for large samples in general.

4.2 Two-way classification. For simplicity, we shall consider only the case of
one observation per cell, when the design is complete. Let “i” denote “treatments”
and “j” denote “blocks”. Suppose $i = 1, 2, \cdots, t$, $j = 1, 2, \cdots, b$
and $N = bt$. Let $F_{ij}(x, y)$ denote the distribution function of (X, Y) for the
$(ij)$th cell. We shall assume that

(i) $F_{ij}(x, y)$ is absolutely continuous,

(ii) the distributions are identical except for location, that is,

$$F_{ij}(x, y) = F(x - \alpha_{ij}, y - \beta_{ij}),$$

(iii) the model is additive, that is,

$$\alpha_{ij} = \xi_i + \eta_j \quad \text{and} \quad \beta_{ij} = \gamma_i + \delta_j,$$

(iv) the “regression” of Y on X is linear.

As before, we notice that we have relaxed just the normality of the distribution
while retaining other features of the classical set up.

Let $f_{ij}(x, y)$ and $f_{ij}(x)$ denote the densities of (X, Y) and X respectively
for the $(ij)$th cell. Then $f_{ij}(x, y) = f(x - \alpha_{ij}, y - \beta_{ij})$ and $f_{ij}(x) = f_1(x - \alpha_{ij})$,
say. Also $f(x, y) = f_1(x) f_2(y - \alpha - \beta x)$, so that

$$f_{ij}(x, y) = f_1(x - \alpha_{ij})\, f_2[y - \beta_{ij} - \alpha - \beta(x - \alpha_{ij})] = f_1(x - \xi_i - \eta_j)\, f_2(y - \alpha - \gamma_i + \beta \xi_i - \delta_j + \beta \eta_j - \beta x).$$

We will be interested in the usual hypothesis

$$H_0: \xi_1 = \xi_2 = \cdots = \xi_t; \quad \gamma_1 = \gamma_2 = \cdots = \gamma_t.$$
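We shall consider a step-down procedure to test $H_0$. Considering the x's
separately we can test, at a level $\alpha_1$,

$$H_{0x}: \xi_1 = \xi_2 = \cdots = \xi_t$$

by the criterion given by Mood [10],

$$\chi^2 = \frac{t(t - 1)}{ba(t - a)} \sum_{i=1}^{t} \left( m_i - \frac{ba}{t} \right)^2, \qquad \text{d.f.} = t - 1,$$

where $a = \tfrac{1}{2}t$ if t is even or $\tfrac{1}{2}(t - 1)$ otherwise, and $m_i$ = the number
of $x_{ij}$'s $(j = 1, 2, \cdots, b)$ greater than $\tilde{x}_j$, the median of the jth column.
Then, considering the conditional distributions of $y_{ij}$'s given $x_{ij}$'s we have to
test

$$H_{0y|x}: y_{ij}\text{'s have medians } \lambda_j + \beta x_{ij}, \tag{27}$$

at a level $\alpha_2$, so that $(1 - \alpha) = (1 - \alpha_1)(1 - \alpha_2)$. We may estimate $\lambda_j$ and
$\beta$ by

$$\hat{\lambda}_j = \underset{i = 1, 2, \cdots, t}{\text{Median}}\,(y_{ij} - \hat{\beta} x_{ij}),$$

and

$$\text{Median}_{\mathrm{I}}(y_{ij} - \hat{\lambda}_j - \hat{\beta} x_{ij}) = \text{Median}_{\mathrm{II}}(y_{ij} - \hat{\lambda}_j - \hat{\beta} x_{ij}),$$

where the groups are with respect to the x's as usual. We note that a, defined as
above, out of the t $(y_{ij} - \hat{\lambda}_j - \hat{\beta} x_{ij})$'s for each j are positive and hence in all ab out
of the bt $(y_{ij} - \hat{\lambda}_j - \hat{\beta} x_{ij})$'s are positive. Let $l_i$ denote the number of positive terms
out of the b $(y_{ij} - \hat{\lambda}_j - \hat{\beta} x_{ij})$'s, for given i. Then we expect $l_i \approx \tfrac{1}{2}b$ if (27) is true.
Also $\sum_{i=1}^{t} l_i = ab$. Let $l_i'$ denote the number of positive terms out of the b
$(y_{ij} - \lambda_j - \beta x_{ij})$'s, for given i. On heuristic considerations, for large samples $\hat{\lambda}_j \approx \lambda_j$
and $\hat{\beta} \approx \beta$, so that the distribution of the l's is asymptotically the same as
that of the l''s subject to $\sum_{1}^{t} l_i' = ab$. Hence,

$$p(l_1, l_2, \cdots, l_t) \sim \prod_{i=1}^{t} \binom{b}{l_i} \Big/ \binom{bt}{ab}$$

for large bt, so that by the lemma,

$$\chi^2 = \frac{t(bt - 1)}{a(t - a) b^2} \sum_{i=1}^{t} \left( l_i - \frac{ab}{t} \right)^2, \qquad \text{d.f.} = t - 1. \tag{28}$$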


The same remark as that at the end of 4.1 will hold good here. Also, it may seem
that we require t large (since we require $\hat{\lambda}_j \approx \lambda_j$ in the above argument), but
if we give a formal proof, similar to that given in 3.2 (iii), we note that $\hat{\beta} \approx \beta$
is sufficient to reduce the proof to the one given by Mood. This does not require
large t but only large bt. Hence (28) gives a test-criterion for large b.


ON CERTAIN CHARACTERISTICS OF THE DISTRIBUTION OF THE
LATENT ROOTS OF A SYMMETRIC RANDOM MATRIX
UNDER GENERAL CONDITIONS¹


By H. R. van der Vaart


Leiden University, Netherlands 


1. Summary. Under certain conditions, to be specified in Theorems 2 and 4, 
the latent roots of the symmetric random matrix F with $\mathcal{E}F = \Phi$ are biased
estimators of the latent roots of $\Phi$; the smallest (largest) root is negatively
(positively) biased. Here bias includes both expectation-bias and median-bias. 
Further properties of the distribution of the latent roots are given, among them 
some relations between covariances of the latent roots, covariances of elements 
of F, and the amounts of expectation-bias of the latent roots. Also, a sufficient 
condition is given for a certain type of symmetry in the joint distribution of the 
latent roots. For applications of the theory presented in this paper to the theory 
of response surface estimation see van der Vaart [9]. 
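
The direction of the bias can be seen in a small simulation. The Python sketch below is an illustration under assumptions of its own: it uses a 2 × 2 symmetric matrix whose three distinct elements are perturbed by independent normal errors, which need not match the conditions of Theorems 2 and 4, and the function names are hypothetical. The roots of a 2 × 2 symmetric matrix are available in closed form, so no linear-algebra library is needed.

```python
import math
import random


def latent_roots_2x2(f11, f12, f22):
    """Latent roots (smallest, largest) of the symmetric matrix [[f11, f12], [f12, f22]]."""
    centre = (f11 + f22) / 2.0
    radius = math.hypot((f11 - f22) / 2.0, f12)
    return centre - radius, centre + radius


def simulate(phi, n_draws=2000, noise=1.0, seed=0):
    """Draw F = Phi + error and average F and its two latent roots.

    phi = (phi11, phi12, phi22). Returns (mean_f, mean_smallest, mean_largest).
    """
    rng = random.Random(seed)
    total_f = [0.0, 0.0, 0.0]
    total_lo = total_hi = 0.0
    for _ in range(n_draws):
        f = [p + rng.gauss(0.0, noise) for p in phi]
        lo, hi = latent_roots_2x2(*f)
        total_f = [t + v for t, v in zip(total_f, f)]
        total_lo += lo
        total_hi += hi
    mean_f = [t / n_draws for t in total_f]
    return mean_f, total_lo / n_draws, total_hi / n_draws
```

Comparing the averaged roots with the roots of the averaged matrix shows the smallest root falling below, and the largest root rising above, the corresponding root of the mean matrix; in the 2 × 2 case this follows from the triangle inequality applied to the radius term.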


2. Introduction, notations, definitions, statement of the problem. We shall
use Latin letters for random variables, Greek letters for parameters, small letters
for real-valued variables, capital letters for square matrices (examples: $f_{ij}$,
$l_h$, $u_{gh}$, $v_{gh}$ real-valued random variables; F, L, U, V random matrices; $\phi_{ij}$,
$\lambda_h$, $\upsilon_{gh}$, $\gamma_{gh}$ real-valued parameters; $\Phi$, $\Lambda$, $\Upsilon$, $\Gamma$ matrices consisting of parameters).
Let $u_{gh}$ be an element of a matrix U, then $u_{\cdot h}$ will be used for the hth column
vector in U so that $u_{\cdot h}$ is a $k \times 1$ matrix whose transpose, $u'_{\cdot h}$, is the hth row of
U'. Finally, a symbol like $\mathcal{E}(F)$, the expectation of a matrix, will denote the
matrix of the expectations,

$$\mathcal{E}(F) = \| \mathcal{E} f_{ij} \|^{\,i = 1 \cdots k}_{\,j = 1 \cdots k}, \tag{2.1}$$

where the superscript $i = 1 \cdots k$ denotes row numbers, the subscript $j = 1 \cdots k$
denotes column numbers (a good example of the usefulness of this compact
notation is the defining equation (4.21) in Section 4).

Now consider a $k \times k$ matrix M with real elements $m_{ij}$. If a probability
measure is defined in $\mathfrak{M}$, the set of possible $k^2$-tuples $(m_{11}, \cdots, m_{kk})$, then
the matrix M will be called a random matrix. This probability measure is singular
(relative to $k^2$-dimensional Lebesgue measure in $\mathfrak{M}$), if a subset of $\mathfrak{M}$ exists with
Lebesgue measure zero and probability measure one (cf., p. 30 of Saks [6],
or p. 611 of Doob [2]). This subset, defined up to a set of $k^2$-dimensional Lebesgue
measure zero, will be denoted by the term M-space, and the probability
distribution over M-space will be denoted by the term (probability) distribution of
M.

Examples of singular probability measures in $\mathfrak{M}$: M may be prescribed to be
orthogonal (M-space is then $\tfrac{1}{2}k(k - 1)$-dimensional) or symmetric (M-space is
$\tfrac{1}{2}k(k + 1)$-dimensional). In the latter case, M-space could be represented as a
subset of $\mathfrak{M}$ by the equations

$$m_{ij} = t_{ij} \quad (j \ge i;\ i = 1, \cdots, k); \qquad m_{ij} = t_{ji} \quad (j < i;\ i = 1, \cdots, k). \tag{2.2}$$

We shall, however, define M-space to be the projection of this subset on
$(m_{11}, \cdots, m_{1k}, m_{22}, \cdots, m_{2k}, \cdots, m_{k-1,k-1}, m_{k-1,k}, m_{kk})$-space. We shall
call the distribution of a $k \times k$ symmetric random matrix continuous if it is
absolutely continuous relative to $\tfrac{1}{2}k(k + 1)$-dimensional Lebesgue measure on
$m_{ij}$ $(1 \le i \le j \le k)$-space.

Two random matrices $M'$ and $M''$ are called (stochastically) equivalent,


(2.3a) $M' \sim M''$,


if $M' = M''$ with probability one (cf. p. 33 of Kolmogorov [4]). Let $f(M)$ be a
real-valued function of the matrix $M$; if $M'$ and $M''$ are equivalent, then


(2.3b) $\mathcal{E} f(M') = \mathcal{E} f(M'')$,


provided these expectations exist. 
Our problem now is this. Let $\Phi$ be a real symmetric $k \times k$ matrix. Let $F$ be a
real symmetric $k \times k$ random matrix, which is continuously distributed and
satisfies


(2.4) $\mathcal{E}(F) = \Phi$.


Then the $k$ latent roots, $l_h$ of $F$ and $\lambda_h$ of $\Phi$, are real $(h = 1, \cdots, k)$. Assign
the subscripts in such a way that


(2.5) $l_1 \le l_2 \le \cdots \le l_k$; $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_k$.


Define diagonal matrices $L$ and $\Lambda$ by


(2.6) $L = \| l_h \delta_{gh} \|^{\,g = 1 \cdots k}_{\,h = 1 \cdots k}$, $\Lambda = \| \lambda_h \delta_{gh} \|^{\,g = 1 \cdots k}_{\,h = 1 \cdots k}$.


Our aim is to investigate the distribution of $L$. Note that, although two or more
roots $\lambda_h$ may be exactly equal, the probability that two or more roots $l_h$ are
exactly equal is always zero, since $F$ is continuously distributed.


3. Definition of a few important matrices. As is well known, one can always
construct a random orthogonal matrix $U$ with real elements such that $FU = UL$.
Here the column $u_{\cdot h}$ of $U$ is the latent vector (= eigenvector) corresponding
to the latent root $l_h$. If two or more latent roots are equal, $l_{h_1} = l_{h_2} = \cdots = l_{h_q}$,
say, then the columns $u_{\cdot h_1}, u_{\cdot h_2}, \cdots, u_{\cdot h_q}$ of any orthogonal $U$ with $FU = UL$
are a basis for the eigenspace corresponding to the latent root $l_{h_1} = \cdots = l_{h_q}$.
What we have said just now may be repeated with $\Upsilon$, $\Phi$, $\Lambda$, $\upsilon$, $\lambda$ instead of $U$, $F$,
$L$, $u$, $l$. We list the following for reference:


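(3.1a) $\Phi = \Upsilon \Lambda \Upsilon'$, $F = U L U'$,
(3.1b) $\Phi \Upsilon = \Upsilon \Lambda$, $F U = U L$,
(3.1c) $\Upsilon' \Phi \Upsilon = \Lambda$, $U' F U = L$,
(3.1d) $\Upsilon' \Phi = \Lambda \Upsilon'$, $U' F = L U'$.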


As the probability that two or more latent roots of $F$ are equal is zero, the
columns $u_{\cdot h}$ of $U$ are uniquely defined with probability one, except for their
sign. As to this question of signs, for the purposes of the present paper it is
sufficient to observe that it is possible to restrict this ambiguity in such a way
that $\operatorname{Det}(U) = +1$ throughout, and that, with $L$ fixed and $U$ variable, the matrix
$F = ULU'$ runs just once through all values $F$ that have the same matrix $L$
of latent roots $l_1 \le \cdots \le l_k$. If two or more latent roots of $\Phi$ are equal, then in
order to represent $\Phi$ by $\Upsilon \Lambda \Upsilon'$ it will suffice to choose just one orthogonal basis
in the corresponding eigenspace.

$U$ and $\Upsilon$ being defined, we will now define $V$ by one of the equivalent relations


(3.2) $V = \Upsilon' U$, $\Upsilon V = U$, $\Upsilon = U V'$,
      $V' = U' \Upsilon$, $V' \Upsilon' = U'$, $\Upsilon' = V U'$.


Evidently $V$ is an orthogonal random matrix with $\operatorname{Det}(V) = +1$.
Finally, define
(3.3) $\tilde{L} = V L V'$.
Equations (3.1c), (2.4), (2.1), (3.2), (3.1c) show that
(3.4) $\Lambda = \Upsilon' \Phi \Upsilon = \Upsilon' \cdot \mathcal{E}(F) \cdot \Upsilon = \mathcal{E}(\Upsilon' F \Upsilon) = \mathcal{E}(V U' F U V') = \mathcal{E}(V L V') = \mathcal{E}(\tilde{L})$.


Hence $\tilde{L}$ would be an unbiased estimator of $\Lambda$ if it were a statistic. Unfortunately,
though $U$ is a statistic, $\Upsilon$ is not. As $\Upsilon$ is usually unknown, $V$ is not a statistic,
nor is $\tilde{L}$. Note that, whereas the off-diagonal elements of $L$ equal zero by definition,
the off-diagonal elements of $\tilde{L}$ do not. Therefore one ought to write $\tilde{l}_{gg}$
for the diagonal elements of $\tilde{L}$. Yet, because of the analogy between $L$ and $\tilde{L}$,
it will sometimes be convenient to write $\tilde{l}_g$ for $\tilde{l}_{gg}$ $(g = 1, \cdots, k)$. Furthermore
note that $\tilde{l}_1$ may well be larger than $\tilde{l}_2$, whereas by definition $l_1 \le l_2$.
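
For $k = 2$ the matrices of this section can be written out in closed form. The Python sketch below is illustrative only: the numerical values of $F$ and the rotation angle that defines $\Upsilon$ are arbitrary assumptions, and the helper names `latent_roots_2x2` and `tilde_L` are hypothetical. It forms $\tilde{L} = \Upsilon' F \Upsilon$ (which equals $VLV'$ because $F = ULU'$ and $V = \Upsilon' U$) so that one can see that its diagonal elements lie between $l_1$ and $l_2$, that its trace equals $\operatorname{tr} L$, and that its off-diagonal element need not vanish.

```python
import math

def latent_roots_2x2(f11, f12, f22):
    """Latent roots l1 <= l2 of the symmetric matrix [[f11, f12], [f12, f22]]."""
    mid = 0.5 * (f11 + f22)
    rad = math.hypot(0.5 * (f11 - f22), f12)
    return mid - rad, mid + rad

def tilde_L(f11, f12, f22, theta):
    """Elements (t11, t12, t22) of Upsilon' F Upsilon.

    Upsilon is taken to be a rotation through theta, with columns
    (cos, sin) and (-sin, cos); this choice is an assumption for illustration.
    """
    c, s = math.cos(theta), math.sin(theta)
    t11 = c * c * f11 + 2 * c * s * f12 + s * s * f22
    t22 = s * s * f11 - 2 * c * s * f12 + c * c * f22
    t12 = c * s * (f22 - f11) + (c * c - s * s) * f12
    return t11, t12, t22

# Arbitrary illustrative values (not taken from the paper).
f11, f12, f22, theta = 1.3, 0.4, 2.1, 0.25
l1, l2 = latent_roots_2x2(f11, f12, f22)
t11, t12, t22 = tilde_L(f11, f12, f22, theta)
print("latent roots:", l1, l2)
print("diagonal of tilde L:", t11, t22, "off-diagonal:", t12)
```

Because $\Upsilon$ is unknown in practice, a computation of this kind is possible only in a simulation where $\Phi$ is chosen by the experimenter.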


4. On certain characteristics of the distribution of $L$ and $\tilde{L}$. Equation (3.4)
suggests that $L$ may have certain undesirable features as an estimator of $\Lambda$.
We are going to investigate the distribution of $L$ (and $\tilde{L}$). We will adhere throughout
to the following assumption.


ASSUMPTION A. The symmetric random matrix $F$ is continuously distributed,²
and $\mathcal{E}(F) = \Phi$.

² The phrase "continuous distribution of a symmetric random matrix" has been defined
in Section 2 as meaning absolute continuity with respect to a natural measure.

It is easy to prove the following theorem.
THEOREM 1.


(4.1) $\mathcal{E}(\operatorname{tr} L) = \mathcal{E}(\operatorname{tr} \tilde{L}) = \operatorname{tr} \Lambda = \operatorname{tr} \Phi = \mathcal{E}(\operatorname{tr} F)$.


PROOF. These equalities follow immediately from (3.4) and the following
properties of traces:


(4.2) $\operatorname{tr} \mathcal{E}(F) = \mathcal{E}(\operatorname{tr} F)$, $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.


For instance, $\operatorname{tr}(\tilde{L}) = \operatorname{tr}(VLV') = \operatorname{tr}(V'VL) = \operatorname{tr}(L)$; hence $\mathcal{E} \operatorname{tr}(L) =
\mathcal{E} \operatorname{tr}(\tilde{L}) = \operatorname{tr} \mathcal{E}(\tilde{L}) = \operatorname{tr} \Lambda$, etc. Proof completed.

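Next we want to prove that $L$ is a biased estimator of $\Lambda$ in various respects.
The following elegant argument was suggested to me by T. W. Anderson in a
personal communication (February 1958). It is based on the following well
known extremal property of the latent roots (see, for example, Satz 10, p. 292
of Gantmacher [3]):


(4.3) $l_k = \max\, (a'Fa) \ge \upsilon'_{\cdot k} F \upsilon_{\cdot k}$,


where the maximum is taken over all vectors (= $k \times 1$ matrices) $a$ with $a'a = 1$;
hence the inequality (4.3) holds for all vectors $\upsilon_{\cdot k}$ in the eigenspace corresponding
to the latent root $\lambda_k$ of $\Phi$. Inequality (4.3) yields


(4.4) $\mathcal{E} l_k \ge \upsilon'_{\cdot k}\, \mathcal{E}(F)\, \upsilon_{\cdot k} = \upsilon'_{\cdot k} \Phi \upsilon_{\cdot k} = \lambda_k$.
A similar argument yields that $\mathcal{E} l_1 \le \lambda_1$.


Another consequence of (4.3) is that $\operatorname{Med} l_k \ge \operatorname{Med}(\upsilon'_{\cdot k} F \upsilon_{\cdot k})$, and hence that
(4.5) $\operatorname{Med} l_k \ge \lambda_k$ if $\operatorname{Med}(\upsilon'_{\cdot k} F \upsilon_{\cdot k}) \ge \lambda_k$.


A similar argument yields that $\operatorname{Med} l_1 \le \lambda_1$ if $\operatorname{Med}(\upsilon'_{\cdot 1} F \upsilon_{\cdot 1}) \le \lambda_1$.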

In order, however, to investigate conditions under which the inequality (4.5) 
is strict, and as a preliminary to a more detailed study of the distribution of L, 
we shall indicate a different method of proof. First we shall prove a lemma. 

LEMMA 1. Under the general assumption A


(4.6) $P(\tilde{l}_1 - l_1 > 0) = 1$; $P(l_k - \tilde{l}_k > 0) = 1$.


PROOF. As $V$ is orthogonal, we have that $\sum_{h=1}^{k} v_{1h}^2 = 1$, $\sum_{h=1}^{k} v_{kh}^2 = 1$.
Hence, because of (3.3) we have


(4.7) $\tilde{l}_1 - l_1 = \sum_{h=1}^{k} v_{1h}^2 (l_h - l_1)$, $l_k - \tilde{l}_k = \sum_{h=1}^{k} v_{kh}^2 (l_k - l_h)$.


This shows that $\tilde{l}_1 - l_1$ and $l_k - \tilde{l}_k$ are essentially non-negative. Hence we need
only prove that


(4.8) $P(\tilde{l}_1 - l_1 = 0) = 0$, $P(l_k - \tilde{l}_k = 0) = 0$.


We shall write down the proof for the first of these two equalities. The equation
$F = ULU' = \Upsilon V L V' \Upsilon'$ (cf. (3.1a) and (3.2)) parametrizes (for this term see
e.g. p. 246 of Busemann [1]) $\tfrac{1}{2}k(k + 1)$-dimensional $F$-space ($F$ is symmetric)
in terms of the Cartesian product of $\tfrac{1}{2}k(k - 1)$-dimensional $V$-space ($V$ is
orthogonal) and $k$-dimensional $L$-space. Now (4.7) shows that $\tilde{l}_1 - l_1 = 0$
if and only if $l_1 = \cdots = l_g < l_{g+1}$ and $0 = v_{1,g+1} = \cdots = v_{1k}$ $(g = 1, \cdots, k)$.
Hence $\tilde{l}_1 - l_1 = 0$ defines a union of subsets of the Cartesian product of $V$-space
and $L$-space of dimension lower than $\tfrac{1}{2}k(k + 1)$. In consequence the image of
$\tilde{l}_1 - l_1 = 0$ in $F$-space has $\tfrac{1}{2}k(k + 1)$-dimensional Lebesgue measure zero, and
as the probability distribution over $F$-space is assumed to be absolutely continuous
with respect to this Lebesgue measure, it follows that $P(\tilde{l}_1 - l_1 = 0) = 0$
(cf. equation (3.5) in Busemann [1]).

Theorem 2 is a trivial consequence of Lemma 1.

THEOREM 2. Under the general assumption A


(4.9) $\mathcal{E}(l_1) < \lambda_1$; $\mathcal{E}(l_k) > \lambda_k$; $\mathcal{E}(l_k - l_1) > \lambda_k - \lambda_1$.


PROOF. $\lambda_1 - \mathcal{E}(l_1) = \mathcal{E}(\tilde{l}_1 - l_1) = \int (\tilde{l}_1 - l_1)\, dP > 0$ by (3.4) and the first
part of Lemma 1. A similar proof holds for the remaining two inequalities in
(4.9).
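
The inequalities (4.9) can be observed in a small simulation. The following Python sketch is a hedged illustration, not part of the paper: it assumes $k = 2$, $\Phi = \operatorname{diag}(\lambda_1, \lambda_2)$, and independent normal errors of common standard deviation `sigma` on $f_{11}$, $f_{12}$, $f_{22}$, which is one continuous distribution satisfying assumption A. The function names are hypothetical.

```python
import math
import random

def latent_roots_2x2(f11, f12, f22):
    """Latent roots l1 <= l2 of the symmetric matrix [[f11, f12], [f12, f22]]."""
    mid = 0.5 * (f11 + f22)
    rad = math.hypot(0.5 * (f11 - f22), f12)
    return mid - rad, mid + rad

def mean_latent_roots(lam1, lam2, sigma, n_draws, seed=0):
    """Monte Carlo means of l1 and l2 for F = diag(lam1, lam2) + noise.

    Assumption for illustration: f11, f12, f22 carry independent
    N(0, sigma^2) errors, so that E(F) = diag(lam1, lam2).
    """
    rng = random.Random(seed)
    sum1 = sum2 = 0.0
    for _ in range(n_draws):
        f11 = lam1 + rng.gauss(0.0, sigma)
        f22 = lam2 + rng.gauss(0.0, sigma)
        f12 = rng.gauss(0.0, sigma)
        l1, l2 = latent_roots_2x2(f11, f12, f22)
        sum1 += l1
        sum2 += l2
    return sum1 / n_draws, sum2 / n_draws

m1, m2 = mean_latent_roots(1.0, 2.0, 0.5, 20000)
print("mean l1:", m1, "(lambda_1 = 1.0)")
print("mean l2:", m2, "(lambda_2 = 2.0)")
```

Under these assumptions the simulated mean of $l_1$ falls below $\lambda_1$ and that of $l_2$ above $\lambda_2$, while their sum stays close to $\operatorname{tr} \Phi$, as Theorem 1 requires.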

Note that the amount of expectation-biases such as $\lambda_1 - \mathcal{E}(l_1)$ is a function
of the distribution of $F$. If this distribution is specified, the expectation-biases
can be calculated from formulae such as


(4.10a) $\lambda_1 - \mathcal{E}(l_1) = \mathcal{E}(\tilde{l}_1 - l_1) = \mathcal{E}\bigl[ \sum_{h=1}^{k} v_{1h}^2 (l_h - l_1) \bigr]$
(cf. (4.7)), which in case $k = 2$ simplifies into
(4.10b) $\lambda_1 - \mathcal{E}(l_1) = \mathcal{E}[ v_{12}^2 (l_2 - l_1) ]$.

The next theorem will contain a result closely related to distribution-bias
(cf. van der Vaart [7]; it is not really distribution-bias since $\tilde{l}_g$ is not a statistic,
see Section 3 of the present paper). Its proof depends on a simple lemma (see
Section 4 in [7]), of which we shall cite a slightly altered version for easy reference
($v$ without subscripts bears no relation to $v$ with subscripts):

LEMMA 2. Whether the random variables $t$ and $u$ are independent or dependent,
if
(4.11a) $P(u > v) = 1$,
then
(4.11b) $P(t < \tau) \ge P(t + u \le \tau + v)$.


A necessary and sufficient condition for equality of the two sides of (4.11b) is


(4.11c) $P[(t + u > \tau + v) \cap (t < \tau)] = 0$.


Under the weaker condition $P(u \ge v) = 1$ either the first inequality sign in
(4.11b) and the second one in (4.11c) should be replaced by $\le$, or the third
inequality sign in (4.11b) should be replaced by $<$ and the first one in (4.11c)
by $\ge$.
We shall denote as Lemma 2' the result obtained by reversing all six inequality
signs in Lemma 2, except the second inequality sign in (4.11b).

THEOREM 3. Under the general assumption A
(4.12a) $P(l_1 < r) \ge P(\tilde{l}_1 \le r)$ for any (real) $r$,

(4.12b) $P(l_k > r) \ge P(\tilde{l}_k \ge r)$ for any (real) $r$,

(4.12c) $P(l_k - l_1 > r) \ge P(\tilde{l}_k - \tilde{l}_1 \ge r)$ for any (real) $r$.
Necessary and sufficient conditions for these inequalities to be strict are
(4.13a) $P[(\tilde{l}_1 > r) \cap (l_1 < r)] > 0$,

(4.13b) $P[(\tilde{l}_k < r) \cap (l_k > r)] > 0$,

(4.13c) $P[(\tilde{l}_k - \tilde{l}_1 < r) \cap (l_k - l_1 > r)] > 0$,
respectively.

PROOF. Because of Lemma 1 we can apply Lemma 2. In Lemma 2 replace
$v$ by 0, $t$ by $l_1$, $u$ by $(\tilde{l}_1 - l_1)$ to obtain (4.12a) and (4.13a). Again in Lemma 2'
replace $v$ by 0, $t$ by $l_k$, $u$ by $(\tilde{l}_k - l_k)$ to obtain (4.12b) and (4.13b), and $v$ by 0,
$t$ by $(l_k - l_1)$, $u$ by $(\tilde{l}_k - \tilde{l}_1 - l_k + l_1)$ to obtain (4.12c) and (4.13c).

The application of conditions (4.13) is easy if every subset of $F$-space, hence
of the Cartesian product of $V$-space and $L$-space, which has a positive
$\tfrac{1}{2}k(k + 1)$-dimensional Lebesgue measure, has at the same time a positive probability
(such is the case if $F$ is distributed according to a nonsingular multinormal
distribution): then all one has to show is that in ($V$-space) $\times$ ($L$-space)
points exist in which both $\tilde{l}_1 > r$ and $l_1 < r$, etc.; see pages 14 and 15 of van
der Vaart [8].
THEOREM 4. If the $\tfrac{1}{2}k(k + 1)$-variate probability density function of the
$f_{ij}$ $(1 \le i \le j \le k)$ is symmetrical with respect to the point with coordinates $\varphi_{ij} =
\mathcal{E} f_{ij}$ $(1 \le i \le j \le k)$, then $l_1$, $l_k$, and $l_k - l_1$ are negatively, positively and
positively median-biased estimators of $\lambda_1$, $\lambda_k$, and $\lambda_k - \lambda_1$, respectively.

PROOF. In inequalities (4.12) put $r = \lambda_1$, $r = \lambda_k$, $r = \lambda_k - \lambda_1$, respectively.
Then the proof will be complete if
(4.14) $P(\tilde{l}_1 \le \lambda_1) = \tfrac{1}{2}$, $P(\tilde{l}_k \ge \lambda_k) = \tfrac{1}{2}$, $P(\tilde{l}_k - \tilde{l}_1 \ge \lambda_k - \lambda_1) = \tfrac{1}{2}$.
Now by (3.3), (3.2) and (3.1c) we have that


$\tilde{L} - \Lambda = \Upsilon' F \Upsilon - \Upsilon' \Phi \Upsilon = \Upsilon'(F - \Phi)\Upsilon$,


$\tilde{l}_g - \lambda_g = \upsilon'_{\cdot g}(F - \Phi)\upsilon_{\cdot g}$ $(g = 1, k)$,


$\tilde{l}_k - \tilde{l}_1 - \lambda_k + \lambda_1 = \sum_{i,j} (\upsilon_{ik}\upsilon_{jk} - \upsilon_{i1}\upsilon_{j1})(f_{ij} - \varphi_{ij})$.


These expressions when equated to zero represent hyperplanes in $F$-space which
contain the center of symmetry. Hence (4.14) holds true.
The argument in the paragraph preceding Theorem 4 immediately yields a
class of distributions, including the normal, of the matrix $F$ for which median-bias
is strict.

We want to emphasize the importance of the role of Lemma 1, i.e., essentially
of assumption A, in our proofs. If, for instance, for the $2 \times 2$ matrix $F$ the probability
$P(f_{12} = 0) = 1$, and $P(f_{11} > c) = P(f_{22} < c) = 0$ (this example stems
from T. W. Anderson in a personal communication (February 1958)), then
$P(\tilde{L} = L) = 1$ and all former deductions are invalid. Note that in this example
our assumption A is violated: the distribution of $F$ is not absolutely continuous
relative to $\tfrac{1}{2}k(k + 1)$-dimensional Lebesgue measure.

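Finally we shall give a few results concerning various moments of $l_g$ and $f_{ij}$.
We shall use the following equalities (where $M^2 = MM$ for any matrix $M$)
in the proofs:

(4.15a) $\operatorname{tr} \tilde{L} = \operatorname{tr}(VLV') = \operatorname{tr} L = \operatorname{tr}(U'FU) = \operatorname{tr} F$,
(4.15b) $\operatorname{tr}(\tilde{L}^2) = \operatorname{tr}(VLV'VLV') = \operatorname{tr}(VL^2V') = \operatorname{tr}(L^2)$
        $= \operatorname{tr}(U'FUU'FU) = \operatorname{tr}(U'F^2U) = \operatorname{tr}(F^2)$,
(4.15c) $\operatorname{tr}(\Phi^2) = \operatorname{tr}(\Lambda^2) = \operatorname{tr}[\mathcal{E}(\tilde{L}) \cdot \mathcal{E}(\tilde{L})]$.


Proofs of these equalities are easy: apply (4.2), (3.3), (3.1c) and (3.4).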
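THEOREM 5. Under the general assumption A we have for the sum of mean square
errors


(4.16) $\sum_{g} \mathcal{E}(l_g^2 - \lambda_g^2) = \sum_{g,h} \operatorname{var} \tilde{l}_{gh} = \sum_{i,j} \operatorname{var} f_{ij}$;


for the sum of variances
(4.17a) $\sum_{g} \operatorname{var} l_g = \sum_{i,j} \operatorname{var} f_{ij} + \sum_{g} \lambda_g^2 - \sum_{g} (\mathcal{E} l_g)^2$
(4.17b) $= \sum_{i,j} \operatorname{var} f_{ij} + \sum_{i,j} \varphi_{ij}^2 - \sum_{g} (\mathcal{E} l_g)^2$;


for the sum of covariances


(4.18) $\sum_{g,h} \operatorname{cov}(l_g, l_h) = \sum_{g,h} \operatorname{cov}(\tilde{l}_g, \tilde{l}_h) = \sum_{i,j} \operatorname{cov}(f_{ii}, f_{jj})$.


Here $\sum_{i,j}$ stands for $\sum_{i=1}^{k} \sum_{j=1}^{k}$, $\sum_{g}$ for $\sum_{g=1}^{k}$.


PROOF. Because of (4.15b) and (4.15c)
$\sum_{i,j} \operatorname{var} f_{ij} = \sum_{i,j} \mathcal{E}(f_{ij}^2) - \sum_{i,j} (\mathcal{E} f_{ij})^2 = \mathcal{E} \operatorname{tr}(F^2) - \operatorname{tr}(\Phi^2) = \mathcal{E} \operatorname{tr}(\tilde{L}^2)
- \operatorname{tr}[\mathcal{E}(\tilde{L}) \cdot \mathcal{E}(\tilde{L})] = \sum_{g,h} \operatorname{var} \tilde{l}_{gh} = \mathcal{E} \operatorname{tr}(L^2) - \operatorname{tr}(\Lambda^2) = \sum_{g} \mathcal{E}(l_g^2 - \lambda_g^2)$.
Equation (4.17a) follows immediately from (4.16); (4.15c) then shows the
equivalence of (4.17a) and (4.17b). Finally apply (4.15a) to prove
$\sum_{g,h} \operatorname{cov}(l_g, l_h) = \mathcal{E}(\operatorname{tr} L)^2 - (\mathcal{E} \operatorname{tr} L)^2$
$= \mathcal{E}(\operatorname{tr} \tilde{L})^2 - (\mathcal{E} \operatorname{tr} \tilde{L})^2 = \mathcal{E}(\operatorname{tr} F)^2 - (\mathcal{E} \operatorname{tr} F)^2$,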


which serves to prove (4.18).
As for the consequence of this theorem concerning response surface theory see
the discussion around equation (14) in van der Vaart [9]. Here we repeat only
that if $k = 2$ equation (4.17a) yields


(4.19) $\sum_{g} \operatorname{var} l_g = \sum_{i,j} \operatorname{var} f_{ij} - 2\alpha_2^2 - 2(\lambda_2 - \lambda_1)\alpha_2$,


where $\alpha_2 = \mathcal{E}(l_2 - \lambda_2) = -\mathcal{E}(l_1 - \lambda_1) > 0$. This equation is important in theoretical
work: in theoretical investigations the data of a problem frequently are
$\lambda_1$, $\lambda_2$, $\operatorname{var} f_{ij}$. From (4.19) it appears that in cases where $\operatorname{var} l_1 = \operatorname{var} l_2$, the
knowledge of the first order moment $\alpha_2$ is sufficient to calculate the second order
moment $\operatorname{var} l_g$. Hence the problem arises to find conditions under which $\operatorname{var} l_1 =
\operatorname{var} l_2$ $(k = 2)$. We replace this problem by the following: to find conditions
under which a kind of symmetry exists in the distribution of $L$ such that $\operatorname{var} l_g =
\operatorname{var} l_{k+1-g}$.
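
The arithmetic that leads from (4.17a) to (4.19) can be sketched as follows. The only inputs are $k = 2$ and Theorem 1, which for $k = 2$ forces the two expectation-biases to be equal and opposite; the symbol $\alpha_2$ is used here for $\mathcal{E}(l_2 - \lambda_2)$, so that $\mathcal{E} l_1 = \lambda_1 - \alpha_2$ and $\mathcal{E} l_2 = \lambda_2 + \alpha_2$.

```latex
\begin{aligned}
\sum_{g} \lambda_g^2 - \sum_{g} (\mathcal{E} l_g)^2
  &= \lambda_1^2 + \lambda_2^2 - (\lambda_1 - \alpha_2)^2 - (\lambda_2 + \alpha_2)^2 \\
  &= 2\lambda_1 \alpha_2 - \alpha_2^2 - 2\lambda_2 \alpha_2 - \alpha_2^2 \\
  &= -2\alpha_2^2 - 2(\lambda_2 - \lambda_1)\,\alpha_2 .
\end{aligned}
```

Substituting this difference into (4.17a) gives the right-hand side of (4.19).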

Let $q$ denote the joint probability density of $l_1, \cdots, l_k$. A type of symmetry
which suits our purpose is defined by
(4.20) $q(l_1, l_2, \cdots, l_k) = q(\gamma - l_k, \gamma - l_{k-1}, \cdots, \gamma - l_1)$.
For, in the first place, because of (2.5) $q$ should be zero except for
(a) $l_1 \le l_2 \le \cdots \le l_k$; (b) $\gamma - l_k \le \gamma - l_{k-1} \le \cdots \le \gamma - l_1$.


Conditions (a) and (b) coincide. In order to show other useful features of a
distribution of $L$ satisfying (4.20) we introduce the matrix


(4.21) ${}^*L = \| l_{k+1-h}\, \delta_{gh} \|^{\,g = 1 \cdots k}_{\,h = 1 \cdots k}$,
cf. the definition of $L$ and $\Lambda$ in (2.6). Now the best way to describe the type of
symmetry determined by (4.20) is by means of
(4.22) $L \sim \gamma I - {}^*L$


($I$ is the identity matrix; the sign $\sim$ was defined in (2.3a)). As the expectations
of functions of (stochastically) equivalent matrices are equal (cf. (2.3b)) we
find from (4.22) that $\mathcal{E}(L) + \mathcal{E}({}^*L) = \gamma I$, whence


(4.23a) $\mathcal{E} l_g + \mathcal{E} l_{k+1-g} = \gamma$ $(g = 1 \cdots k)$.
From the definition of ${}^*L$, (4.23a) and Theorem 1 we find
(4.23b) $2\, \mathcal{E} \operatorname{tr} L = 2\, \mathcal{E} \operatorname{tr} {}^*L = 2 \operatorname{tr} \Lambda = 2 \operatorname{tr} \Phi = k\gamma$.


Now consider the direct product of $L$ and $L$ (cf. p. 81 of MacDuffee [5]); as
$L = L$, we need not distinguish between left and right direct product. Evidently


$L \otimes L \sim (\gamma I - {}^*L) \otimes (\gamma I - {}^*L)$.
Taking expectations and subtracting


$(\mathcal{E} L) \otimes (\mathcal{E} L) = [\mathcal{E}(\gamma I - {}^*L)] \otimes [\mathcal{E}(\gamma I - {}^*L)]$
we find
$\mathcal{E}[L \otimes L - (\mathcal{E} L) \otimes (\mathcal{E} L)] = \mathcal{E}[{}^*L \otimes {}^*L - (\mathcal{E}\, {}^*L) \otimes (\mathcal{E}\, {}^*L)]$,

whence
(4.23c) $\operatorname{cov}(l_{g_1}, l_{g_2}) = \operatorname{cov}(l_{k+1-g_1}, l_{k+1-g_2})$ $(g_1, g_2 = 1 \cdots k)$,
so that for $g_1 = g_2 = g$
(4.23d) $\operatorname{var} l_g = \operatorname{var} l_{k+1-g}$ $(g = 1 \cdots k)$,
i.e., the equality which prompted this part of our investigation. We shall now
give a condition on the distribution of $F$, sufficient to ensure the equivalence of
$L$ and $\gamma I - {}^*L$. This condition will ensure a fortiori that (4.23d) holds true.

THEOREM 6. Let $F$ be a real symmetric random $k \times k$ matrix with $\mathcal{E} F = \Phi$.
Then in order that $L \sim \gamma I - {}^*L$ ($L$ and ${}^*L$ consisting of the latent roots of $F$
according to (2.6) and (4.21)), it is sufficient that some pair of real, non-singular,
random or non-random, $k \times k$ matrices, $M_1$ and $M_2$, exists such that


(4.24) $M_1^{-1} F M_1 \sim \gamma I - M_2^{-1} F M_2$.
This condition entails that

(4.25a) $2 \operatorname{tr} \mathcal{E}(F) = 2 \operatorname{tr} \Phi = k\gamma$
and in case $M_1$ and $M_2$ are not random


(4.25b) $\sum_{i,j} \varphi_{ij} \bigl[ \{M_1^{-1}\}_{gi} \{M_1\}_{jh} + \{M_2^{-1}\}_{gi} \{M_2\}_{jh} \bigr] = \gamma \cdot \delta_{gh}$ $(g, h = 1 \cdots k)$.


PROOF. Denote the functional relation which to any matrix assigns the matrix
of its latent roots (ordered according to increasing magnitude) by $\Psi$; then
$L = \Psi(F)$. It is well known that under the conditions of the theorem


(4.26a) $\Psi(M_1^{-1} F M_1) = \Psi(F) = L$.


Likewise, since $\gamma - l_k \le \gamma - l_{k-1} \le \cdots \le \gamma - l_1$ are the latent roots of $\gamma I - F$
if $l_1 \le \cdots \le l_k$ are the latent roots of $F$: $\Psi(\gamma I - F) = \gamma I - {}^*L$, whence


(4.26b) $\Psi(\gamma I - M_2^{-1} F M_2) = \Psi(\gamma I - F) = \gamma I - {}^*L$.


Since $\Psi$ is continuous, comparison of (4.26a) and (4.26b) proves the first part
of the theorem. Equation (4.25a) coincides with (4.23b) and may, of course, be
proved directly from (4.24). Equation (4.25b) holds because the expectations
of corresponding elements of equivalent matrices are equal (the symbol $\{A\}_{ij}$,
$A$ any matrix, stands for $a_{ij}$).

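In conclusion we want to give an application of Theorem 6. Take $k = 2$,


$M_1 = I$, $M_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. Then $M_2^{-1} F M_2 = \begin{pmatrix} f_{22} & -f_{12} \\ -f_{12} & f_{11} \end{pmatrix} = {}^*F$, say. According
to Theorem 6 we have that $L \sim \gamma I - {}^*L$ if


(4.27) $F \sim \gamma I - {}^*F$.
This equivalence has some interesting consequences with respect to the first and
second order moments of the distribution of $F$. By an argument exactly like the
argument leading to the various equations (4.23) we find $\mathcal{E}(F) + \mathcal{E}({}^*F) = \gamma I$,
$\mathcal{E}[F \otimes F - (\mathcal{E} F) \otimes (\mathcal{E} F)] = \mathcal{E}[{}^*F \otimes {}^*F - (\mathcal{E}\, {}^*F) \otimes (\mathcal{E}\, {}^*F)]$, whence


(4.27a) $\mathcal{E} f_{11} + \mathcal{E} f_{22} = \gamma$,


(4.27b) $\operatorname{var} f_{11} = \operatorname{var} f_{22}$, $\operatorname{cov}(f_{11}, f_{12}) = -\operatorname{cov}(f_{12}, f_{22})$.


There are no consequences of (4.27) with respect to $\mathcal{E} f_{12}$, $\operatorname{var} f_{12}$, $\operatorname{cov}(f_{11}, f_{22})$.
As a corollary, if $F$ is normally distributed, (4.27a) and (4.27b) are sufficient
in order that $q(l_1, l_2) = q(\gamma - l_2, \gamma - l_1)$, whence $\operatorname{var} l_1 = \operatorname{var} l_2$.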
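
The two facts about latent roots used in this application (similarity leaves them unchanged, and the roots of $\gamma I - F$ are $\gamma - l_2 \le \gamma - l_1$) can be checked numerically for $k = 2$. The Python sketch below is illustrative: the values of $F$ and $\gamma$ are arbitrary assumptions, the helper names are hypothetical, and the form of ${}^*F$ is the one obtained from $M_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.

```python
import math

def latent_roots_2x2(f11, f12, f22):
    """Latent roots l1 <= l2 of the symmetric matrix [[f11, f12], [f12, f22]]."""
    mid = 0.5 * (f11 + f22)
    rad = math.hypot(0.5 * (f11 - f22), f12)
    return mid - rad, mid + rad

def star_F(f11, f12, f22):
    """Elements of *F = M2^{-1} F M2 for M2 = [[0, 1], [-1, 0]]."""
    return f22, -f12, f11

def gamma_minus(matrix, gamma):
    """Elements of gamma*I - matrix for a symmetric 2 x 2 matrix."""
    m11, m12, m22 = matrix
    return gamma - m11, -m12, gamma - m22

F = (1.3, 0.4, 2.1)   # arbitrary illustrative values of f11, f12, f22
gamma = 3.0           # arbitrary illustrative value
l1, l2 = latent_roots_2x2(*F)
print("roots of F:          ", l1, l2)
print("roots of *F:         ", latent_roots_2x2(*star_F(*F)))
print("roots of gamma*I - F:", latent_roots_2x2(*gamma_minus(F, gamma)))
```

The last line should reproduce $\gamma - l_2$ and $\gamma - l_1$ in that order, which is the content of $\Psi(\gamma I - F) = \gamma I - {}^*L$ in the proof of Theorem 6.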


Acknowledgment. My thanks are due to Professor Dr. R. J. Hader of the
Institute of Statistics, North Carolina State College of the University of North
Carolina, Raleigh, for his suggestion that it might be interesting to investigate
some questions in the theory of response surface estimation.


THE DISTRIBUTION OF NONCENTRAL MEANS
WITH KNOWN COVARIANCE


By ALAN T. JAMES
Yale University


1. Summary. The noncentral means distribution for infinite error degrees of 
freedom and the noncentral Wishart distribution are derived as expansions in 
zonal polynomials. A method of calculating the zonal polynomials is outlined, and 
orthogonality properties of their coefficients stated. 


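2. Introduction. Let the $k \times n$ matrix variate $X$, with $k \le n$, be distributed as
(1) $dF(X; M, \Sigma) = (2\pi)^{-nk/2} |\Sigma|^{-n/2} \operatorname{etr}\{-\tfrac{1}{2}[\Sigma^{-1}(X - M)(X - M)']\}\, dX$
where $\operatorname{etr}(A) = \exp(\operatorname{tr}(A))$ and thus

$E[X] = M$,
$\operatorname{Cov}(x_{i\nu_1}, x_{j\nu_2}) = \sigma_{ij}\, \delta_{\nu_1 \nu_2}$, $\nu_1, \nu_2 = 1, \cdots, n$,
$\Sigma = (\sigma_{ij})$.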


We shall find zonal function expansions of:
1. the noncentral Wishart distribution, which is the distribution of


$S = n^{-1}(XX')$;


2. the noncentral means distribution when the covariance matrix is known,
i.e., the distribution of the latent roots $l_1, \cdots, l_k$ of the determinantal equation


(2) $|S - l\Sigma| = 0$.


This is the limiting case of the general distribution as the error degrees of freedom
tend to infinity. It should not be confused with the asymptotic results of
Hsu [10, 11], who considered the asymptotic distribution as the covariance
matrix of $X$ tends to zero. The distribution considered in this paper is a generalization
of the noncentral $\chi^2$ distribution. Hsu's results are a generalization
of the normal approximation to the noncentral $\chi^2$ distribution.

If $Y = \Sigma^{-1/2} X$, then the density of $Y$ is $dF(Y; \Sigma^{-1/2} M, I_k)$ and $l_1, \cdots, l_k$ are
the latent roots of $n^{-1} YY'$.

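The central distributions ($M = 0$) were found for cases

1. by Wishart [23] as

(3) $\dfrac{n^{nk/2}}{2^{nk/2}\, \pi^{k(k-1)/4}\, |\Sigma|^{n/2} \prod_{i=1}^{k} \Gamma(\tfrac{1}{2}(n + 1 - i))}\; |S|^{(n-k-1)/2} \operatorname{etr}\bigl(-\tfrac{1}{2} n \Sigma^{-1} S\bigr)\, dS$,

2. by Fisher [5], Hsu [9], and Roy [19] as

(4) $\dfrac{n^{nk/2}\, \pi^{k/2}}{2^{nk/2} \prod_{i=1}^{k} \Gamma(\tfrac{1}{2}(n + 1 - i))\, \Gamma(\tfrac{1}{2}(k + 1 - i))}\; \exp\bigl(-\tfrac{1}{2} n \sum_{i} l_i\bigr) \prod_{i} l_i^{(n-k-1)/2} \prod_{i<j} (l_i - l_j) \prod_{i} dl_i$.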


The noncentral Wishart distribution has been studied by T. W. Anderson 
and Girshick [2] and subsequently by Anderson [1], who obtained it for the case 
in which the rank of $M \le 2$. M. Weibull [24] derived the noncentral Wishart
distribution for rank 3. In previous papers [13, 14] I obtained a power series 
expansion for the distribution, but for some purposes, notably for deriving 
the noncentral means distribution, a zonal function expansion is preferable. 
Herz [6] has shown that the function entering into the distribution is a Bessel 
function of matrix argument, which he has studied in relation to hypergeometric 
functions of similar type. 

Roy [20, 21] has investigated the noncentral means distribution, for the general 
case of finite error degrees of freedom, and obtained the distribution for the case 
of one nonzero parameter of noncentrality, i.e., with M of rank 1. Bartlett [3] 
showed that the general distribution could be expanded in a power series, of 
which he calculated the coefficients up to 3rd order. Constantine and James [4] 
supplied improved methods of calculating the coefficients and tabulated the co- 
efficients of 4th order; but the series is still far too complicated. 


3. The distributions expressed as multiple integrals. By taking out the factors
involving $M$, (1) can be rewritten as


(5) $dF(X; M, \Sigma) = \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1} MM') \operatorname{etr}(M'\Sigma^{-1} X)\, dF(X; 0, \Sigma)$,
or
(6) $dF(Y; \Sigma^{-1/2} M, I_k) = \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1} MM') \operatorname{etr}((\Sigma^{-1/2} M)'Y)\, dF(Y; 0, I_k)$.
(Note that, although $\Sigma^{-1} X M'$ is a $k \times k$ matrix and $M'\Sigma^{-1} X$ is an $n \times n$ matrix,
$\operatorname{tr}(\Sigma^{-1} X M') = \operatorname{tr}(M'\Sigma^{-1} X)$.)

The distributions $dF(X; 0, \Sigma)$ and $dF(Y; 0, I_k)$, for which $M = 0$, give rise
to the central distribution (3) of $S$ and the null distribution (4) of $l_1, \cdots, l_k$
respectively. By arguments presented in James [13], one readily sees the following
THEOREM I.


1. The noncentral Wishart distribution is the central distribution (3) multiplied
by the function


(7) $\operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1} MM') \int_{O(n)} \operatorname{etr}(M'\Sigma^{-1} X H)\, dH$,
where $dH$ stands for the invariant measure on the group $O(n)$ of $n \times n$ orthogonal
matrices $H$ normalized so that $\int dH = 1$.

2. The noncentral distribution of $l_1, \cdots, l_k$ is the central distribution (4) multiplied
by the factor


(8) $\operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1} MM') \int_{O(k)} \int_{O(n)} \operatorname{etr}((\Sigma^{-1/2} M)' H_1 Y H_2)\, dH_1\, dH_2$.


(8) is a symmetric function of the latent roots $nl_1, \cdots, nl_k$ of $YY'$ and of the latent
roots of $M'\Sigma^{-1} M$, which are the parameters of noncentrality.


4. Results from zonal function theory. 


THEOREM II. If $W$ is a $k \times n$ matrix, $k \le n$, then


(9) $\int_{O(n)} (\operatorname{tr}(WH))^{2f}\, dH = \sum_{p \in P(f,k)} \dfrac{\chi_p(1)}{Z_p(I_n)}\, Z_p(WW')$,


(10) $(\operatorname{tr}(A))^{f} = \dfrac{2^f f!}{(2f)!} \sum_{p \in P(f,k)} \chi_p(1)\, Z_p(A)$,


(11) $\int_{O(k)} Z_p(AHBH')\, dH = \dfrac{Z_p(A)\, Z_p(B)}{Z_p(I_k)}$,
where $P(f, k)$ is the set of partitions $p = (f_1, f_2, \cdots, f_k)$ of the positive integer
$f$ into not more than $k$ parts, $\chi_p(1)$ is the dimension of the representation
$[2f_1, 2f_2, \cdots, 2f_k]$ of the symmetric group, which is given in equation (16)
or can be found from its character tables; $A$ and $B$ are symmetric $k \times k$ matrices
and $Z_p(A)$ is the zonal polynomial corresponding to the partition $p$.

That is, $Z_p(A)$ is a symmetric homogeneous polynomial in the latent roots of
$A$ which, under the transformations


$Z_p(A) \to Z_p(L^{-1} A L'^{-1})$,


by nonsingular $k \times k$ matrices $L$ of the linear group, generates a representation
space of the irreducible representation $\{2f_1, 2f_2, \cdots, 2f_k\}$ of the linear group.
$Z_p(A)$ has been normalized, i.e., multiplied by a constant, so that equation (10)
holds.

Equation (11) was established by James [15]. (9) and (10) will be derived in
another paper (James [17]).

If $W$ or $A$ has rank $r < k$, the zonal polynomials $Z_p$ corresponding to partitions
$p$ of $f$ into more than $r$ parts vanish (see the remark at the end of Section 8).
Since the nonzero latent roots of the $k \times k$ matrix $W'W$ are the same as
those of the $n \times n$ matrix $WW'$, $Z_p(W'W) = Z_p(WW')$.

LEMMA:


$\int_{O(n)} (\operatorname{tr}(WH))^{2f+1}\, dH = 0$.


PROOF: Substitute $(-I)H$ for $H$. Since $dH$ is the invariant measure on $O(n)$,
$d((-I)H) = dH$. Then
$\int_{O(n)} (\operatorname{tr}(WH))^{2f+1}\, dH = \int_{O(n)} (\operatorname{tr}(W(-I)H))^{2f+1}\, d((-I)H)$


$= -\int_{O(n)} (\operatorname{tr}(WH))^{2f+1}\, dH$.

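THEOREM III. The function (7), which multiplies the probability density of the
central Wishart distribution to give the noncentral distribution, is


(12) $\operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1} MM') \int_{O(n)} \operatorname{etr}(M'\Sigma^{-1} X H)\, dH$


$= \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1} MM') \sum_{f=0}^{\infty} \dfrac{1}{(2f)!} \sum_{p \in P(f,k)} \dfrac{\chi_p(1)}{Z_p(I_n)}\, Z_p(\Sigma^{-1} MM' \Sigma^{-1} nS)$.


$Z_p(\Sigma^{-1} MM' \Sigma^{-1} nS)$ is to be understood as the zonal function of the latent roots of
$\Sigma^{-1} MM' \Sigma^{-1} nS$.


PROOF: Expanding the exponential under the integral sign in (12) in a power
series, we have


$\int_{O(n)} \operatorname{etr}(M'\Sigma^{-1} X H)\, dH = \sum_{g=0}^{\infty} \dfrac{1}{g!} \int_{O(n)} (\operatorname{tr}(M'\Sigma^{-1} X H))^{g}\, dH$.

The integrals of the odd powers, $g = 2f + 1$, vanish, and those of the even
powers, $g = 2f$, are given by (9) with $W = M'\Sigma^{-1} X$, an $n \times n$ matrix of rank
$\le k$. Hence


$\int_{O(n)} \operatorname{etr}(M'\Sigma^{-1} X H)\, dH = \sum_{f=0}^{\infty} \dfrac{1}{(2f)!} \sum_{p \in P(f,k)} \dfrac{\chi_p(1)}{Z_p(I_n)}\, Z_p(M'\Sigma^{-1} XX' \Sigma^{-1} M)$.
Since $XX' = nS$, and the nonzero latent roots of $M'\Sigma^{-1} XX' \Sigma^{-1} M$ agree with
those of $\Sigma^{-1} MM' \Sigma^{-1} XX'$, (12) follows.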


5. Noncentral means distribution for known covariance. Using equations (9)
and (11) of Theorem II to evaluate the integral in (8), we have, putting $W =
(\Sigma^{-1/2} M)' H_1 Y$,


$\operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1} MM') \int_{O(k)} dH_1 \int_{O(n)} \operatorname{etr}((\Sigma^{-1/2} M)' H_1 Y H_2)\, dH_2$


$= \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1} MM') \sum_{f=0}^{\infty} \dfrac{1}{(2f)!} \sum_{p \in P(f,k)} \dfrac{\chi_p(1)}{Z_p(I_n)} \int_{O(k)} Z_p(\Sigma^{-1/2} MM' \Sigma^{-1/2} H_1 YY' H_1')\, dH_1$


(13) $= \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1} MM') \sum_{f=0}^{\infty} \dfrac{1}{(2f)!} \sum_{p \in P(f,k)} \dfrac{\chi_p(1)}{Z_p(I_n)\, Z_p(I_k)}\, Z_p(\Sigma^{-1} MM')\, Z_p(nl_1, \cdots, nl_k)$,


where $Z_p(nl_1, \cdots, nl_k) = Z_p(YY')$. Hence
THEOREM IV. The distribution of the latent roots of the determinantal equation
$|S - l\Sigma| = 0$,


where $S = n^{-1} XX'$ and $X$ is distributed as in (1) with $M \ne 0$, is the density (4)
multiplied by the function given on the right hand side of equation (13).

The zonal polynomials up to 4th order are tabulated in James [15], and those
of 5th order are appended.


6. Calculation of the zonal polynomials. The coefficients of the zonal polynomials
can be calculated by considering a certain representation of the symmetric
group. The mathematical theory upon which this calculation is based is
rather extensive and will be postponed to another paper (James [17]).

Consider the set $D$ of distributions corresponding to the partition $(2^f)$ of the
integer $2f$ into $f$ parts each of size 2. Such a distribution, or doublet as we shall
prefer to call it, can be considered as a pairing $\{i_1 i_2\}\{i_3 i_4\} \cdots \{i_{2f-1} i_{2f}\}$ of the
ordinal numbers $1, 2, 3, \cdots, 2f$, not taking account of the order of the pairs or
the order of the numbers within a pair, e.g., $\{4\,3\}\{1\,2\} = \{1\,2\}\{3\,4\}$. There are
clearly $N = (2f)!/(2^f f!)$ doublets in $D$.
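
The count of doublets can be confirmed by direct enumeration. The Python sketch below is an illustration only; the function names `doublets` and `count_formula` are hypothetical. Each doublet is stored as a set of unordered pairs, so that neither the order of the pairs nor the order within a pair matters.

```python
from math import factorial

def doublets(points):
    """Yield every pairing of an even-sized list of ordinals.

    Each doublet is a frozenset of two-element frozensets, which ignores
    both the order of the pairs and the order within a pair.
    """
    if not points:
        yield frozenset()
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in doublets(remaining):
            yield tail | {frozenset((first, partner))}

def count_formula(f):
    """N = (2f)! / (2^f f!)."""
    return factorial(2 * f) // (2 ** f * factorial(f))

for f in range(1, 5):
    enumerated = sum(1 for _ in doublets(list(range(1, 2 * f + 1))))
    print(f, enumerated, count_formula(f))
```

For each $f$ the enumerated count and the formula should agree.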

The symmetric group $S_{2f}$ of permutations of the ordinals $1, 2, \cdots, 2f$,

$1, 2, \cdots, 2f \to \sigma 1, \sigma 2, \cdots, \sigma(2f)$, $\sigma \in S_{2f}$,
induces a transitive group of permutations
$\{i_1 i_2\}\{i_3 i_4\} \cdots \{i_{2f-1} i_{2f}\} \to \{\sigma i_1\, \sigma i_2\}\{\sigma i_3\, \sigma i_4\} \cdots \{\sigma i_{2f-1}\, \sigma i_{2f}\}$

of the doublets. Choose a doublet, e.g., $\{1\,2\}\{3\,4\} \cdots \{2f-1\;\,2f\}$, as origin in $D$
and let $T$ be the isotropy subgroup of $S_{2f}$, i.e., the subgroup which leaves the
origin fixed. $T$ is clearly of order $2^f f!$. The isotropy group $T$ divides $D$ into equivalence
classes or orbits, two doublets being in the same orbit if and only if there
is an element of $T$ which transforms one into the other. (The orbit determines an
invariant relation in the sense of James [16].)

The orbits are characterized by partitions $\nu = (1^{\nu_1}, 2^{\nu_2}, \cdots)$ of $f$, $\nu_1 + 2\nu_2 + \cdots =
f$. Namely, one compares any doublet of the orbit with the doublet taken as
origin and counts the lengths of the cycles linking them. For example, in the case
$f = 3$, to compare the doublet $\{13\}\{24\}\{56\}$ with the origin $\{12\}\{34\}\{56\}$, write
a row of $2f = 6$ dots and join them with bars above according to the
pairings of the doublet $\{12\}\{34\}\{56\}$ and below according to
pairings of the doublet $\{13\}\{24\}\{56\}$, as follows.


[Diagram: six dots in a row; bars above join dots 1-2, 3-4 and 5-6 (doublet $\{12\}\{34\}\{56\}$); bars below join dots 1-3, 2-4 and 5-6 (doublet $\{13\}\{24\}\{56\}$).]


The $2f$ dots are divided into cycles of even length, viz. 4 and 2. Dividing the
lengths of the cycles by two, we have the partition (21) of $f = 3$ as the orbit-partition
which characterizes the orbit to which the doublet $\{13\}\{24\}\{56\}$ belongs.
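
The cycle-counting rule just described can be sketched in a few lines of Python. This is an illustration under stated assumptions: doublets are represented as lists of pairs, and the function name `orbit_partition` is hypothetical. Starting from a dot, the code alternately follows a bar above (a pair of the origin) and a bar below (a pair of the doublet) until it returns to its starting point.

```python
def orbit_partition(doublet, origin):
    """Partition of f obtained from the cycles linking `doublet` with `origin`.

    Both arguments are sequences of pairs covering the same 2f ordinals.
    The result lists the half-lengths of the cycles in decreasing order.
    """
    above = {}   # partner of each ordinal in the origin
    below = {}   # partner of each ordinal in the doublet
    for x, y in origin:
        above[x], above[y] = y, x
    for x, y in doublet:
        below[x], below[y] = y, x
    seen = set()
    parts = []
    for start in above:
        if start in seen:
            continue
        length = 0
        point = start
        while point not in seen:
            seen.add(point)
            mate = above[point]    # follow a bar above
            seen.add(mate)
            point = below[mate]    # follow a bar below
            length += 2
        parts.append(length // 2)
    return tuple(sorted(parts, reverse=True))

origin = [(1, 2), (3, 4), (5, 6)]
print(orbit_partition([(1, 3), (2, 4), (5, 6)], origin))
```

For the example in the text the cycles have lengths 4 and 2, so the function returns the partition (2, 1).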
To the orbit-partition $\nu = (1^{\nu_1}, 2^{\nu_2}, \cdots)$ of $f$ is assigned the monomial $s_1^{\nu_1} s_2^{\nu_2} \cdots$
in the sums $s_i$ of the $i$th powers of the latent roots of the argument matrix $A$ of
the zonal polynomial,


$Z_p(A) = \sum_{\nu} z_{p\nu}\, s_1^{\nu_1} s_2^{\nu_2} \cdots$.


The coefficient $z_{p\nu}$ in the zonal polynomial $Z_p(A)$ corresponding to the partition
$p = (f_1 f_2 \cdots)$ is proportional to the sum of the values of the character $\chi_p(\sigma)$
of the representation $[2f_1, 2f_2, \cdots]$ of $S_{2f}$ over those elements $\sigma \in S_{2f}$ which map
the origin $\{12\}\{34\} \cdots \{2f-1\;\,2f\}$ into the orbit determined by the partition
$\nu = (1^{\nu_1}, 2^{\nu_2}, \cdots)$ of $f$,
(14) $z_{p\nu} = k \sum_{\sigma \in T\tau T} \chi_p(\sigma)$.

$\tau$ is any element of $S_{2f}$ which maps the origin into the given orbit. The set of all
elements of $S_{2f}$ which do so is then clearly the double coset $T\tau T$ of the isotropy
group $T$. We can choose $k$ to make the coefficient of $s_1^f$ unity.

Instead of characters, one may use coefficients of primitive idempotents of the
symmetric group algebra such as those of Alfred Young. (See Littlewood [18],
Section 5.4 or Rutherford [22], Chapter II and also James [16].)

We now give an example of the calculation of $Z_p$ for the partition $p = (f_1 f_2) =
(21)$ of $f = 3$. Arrange the ordinals $1, 2, \cdots, 6$ in two rows of a Young symmetry
diagram corresponding to the partition $(2f_1, 2f_2) = (42)$.


1 2 3 4 


5 6 


Let us write a function on the doublets D as a formal linear combination of 
doublets with its values for coefficients. In particular, the function which has 
value unity at the origin {12}{34}{56} and zero on all other doublets will again 
be denoted by the symbol {12}{34}{56}. 

Apply to this function the Young symmetrizer. This is the element $s$ of the
group algebra of $S_{2f}$ which is the sum of the permutations within the rows of the
symmetry diagram. To the resulting function, we apply the alternator $a$ which
is the linear combination of permutations within the columns of the symmetry
diagram whose coefficients are $\pm 1$ according as the permutation is even or odd.
In the symmetrizer $s$, it is only necessary to include one element from each coset
$\sigma T'$, where $T'$ is the subgroup of $T$ whose elements permute within the rows of
the symmetry diagram. Likewise, one need only include in the alternator $a$ one
element from each coset $T''\sigma$, where $T''$ is the subgroup of $T$ which permutes
within the columns. Thus we put


$s = (\,) + (13) + (14)$
$a = (\,) - (15)$
and calculate


$s\, \{12\}\{34\}\{56\} = \{12\}\{34\}\{56\} + \{14\}\{23\}\{56\} + \{13\}\{24\}\{56\}$
$as\, \{12\}\{34\}\{56\} = \{12\}\{34\}\{56\} + \{14\}\{23\}\{56\} + \{13\}\{24\}\{56\}$
$\qquad - \{16\}\{34\}\{25\} - \{16\}\{23\}\{45\} - \{16\}\{24\}\{35\}$.


Since these doublets are in orbits determined by the orbit-partitions


$(1^3) + (21) + (21)$
$- (21) - (3) - (3)$
$= (1^3) + (21) - 2(3)$
the zonal polynomial for $p = (21)$ is
$Z_p(A) = s_1^3 + s_1 s_2 - 2 s_3$.

7. Orthogonality of coefficients of zonal polynomials. The zonal polynomials
are really idempotents of the tensor representation of the linear group which correspond
to idempotents of the representation of the symmetric group in the
space of functions on the doublets. Since the coefficients of idempotents belonging
to inequivalent representations must be orthogonal, it follows that the coefficients
of the zonal polynomials must satisfy orthogonality relations which can be shown
to be


(15) $\sum_{\nu} \dfrac{z_{p\nu}\, z_{q\nu}}{z_{(f)\nu}} = \delta_{pq}\, \dfrac{(2f)!}{2^f f!\, \chi_p(1)}$,

where $p$ and $q$ are partitions of the integer $f$ and the summation with respect to
$\nu$ is taken over all such partitions. $\chi_p(1)$, denoted by $c(p)$ in James [15], is the
dimension of the representation $[2f_1, 2f_2, \cdots, 2f_s]$ of $S_{2f}$, which is known to be


(16) $\chi_p(1) = (2f)!\, \dfrac{\prod_{i<j} (l_i - l_j)}{l_1!\, l_2! \cdots l_s!}$,

where $l_1 = 2f_1 + s - 1$, $l_2 = 2f_2 + s - 2$, $\cdots$, $l_s = 2f_s$. $(f)$ is the partition of
$f$ into a single part. The coefficient $z_{(f)\nu}$ in the zonal polynomial of $(f)$ is, in fact,
the number of doublets in the orbit of $\nu$.

Hua [7, 8] has shown that the zonal polynomial $Z_p(A)$ can be expressed as a
linear combination of characters of the linear group (Schur functions) corresponding
to symmetry diagrams of order $\le p$,

\[ Z_p(A) = \sum_{q \le p} r_q\, \chi_q(A), \]

where $q$ runs over those partitions of $f$ which are not above $p$.

In particular, the zonal function corresponding to the lowest partition $p = (1^f)$
is the $f$th elementary symmetric function of the latent roots of $A$, and if all
but $r$ roots are zero, the zonal polynomials corresponding to partitions
$p = (f_1, f_2, \cdots, f_{r+1}, \cdots)$ of $f$ into more than $r$ parts vanish.
DISTRIBUTION OF A DEFINITE QUADRATIC FORM FOR
NON-CENTRAL NORMAL VARIATES


By B. K. Shah and C. G. Khatri
University of Baroda, India


1. Summary. In this paper we generalize the result of James Pachares [1] on
the distribution of a definite quadratic form to the case of non-central normal
variates.


2. The problem. Suppose we have a quadratic form $Q = \tfrac{1}{2} y'Ay$ where $A$ is
a $p \times p$ symmetric positive semi-definite matrix of rank $n$ ($\le p$) and $y' =
(y_1, \cdots, y_p)$. The $y_i$'s are independently distributed normal variates with
means $\nu_i$ and variance one, $i = 1, 2, \cdots, p$. It is well-known that we can make
an orthogonal transformation reducing $Q$ to its canonical form, i.e.,

\[ Q = \tfrac{1}{2} \sum_{i=1}^{n} a_i x_i^2, \]

where the coefficients $a_1, \cdots, a_n$ are the non-zero latent roots of the matrix $A$
(all $a_i$'s positive). Under such a transformation $x_1, \cdots, x_n$ remain independent
normal variates with means $\mu_i$ and variance one; ($\mu_i$ is obtained from the $\nu_i$'s
in the same manner as $x_i$ is obtained from the $y_i$'s.) The problem is to find
$F(t) = \Pr(Q \le t)$.
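As an illustration of the reduction to canonical form, here is a small sketch for a hypothetical $2 \times 2$ case. The matrix, its latent roots and vectors, and the means are illustrative assumptions, not values from the paper. The sketch checks numerically that $\tfrac{1}{2} y'Ay$ and $\tfrac{1}{2} \sum a_i x_i^2$ agree when $x$ is obtained from $y$ by the orthogonal transformation.

```python
import math
import random

# Hypothetical 2 x 2 example: A = [[2, 1], [1, 2]] has latent roots 3 and 1
# with orthonormal latent vectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
A = [[2.0, 1.0], [1.0, 2.0]]
roots = [3.0, 1.0]
r = 1.0 / math.sqrt(2.0)
P = [[r, r], [r, -r]]  # rows are the latent vectors


def quadratic_form(y):
    """Q = (1/2) y'Ay computed directly."""
    return 0.5 * sum(y[i] * A[i][j] * y[j] for i in range(2) for j in range(2))


def rotate(v):
    """Orthogonal transformation x = P v."""
    return [sum(P[i][j] * v[j] for j in range(2)) for i in range(2)]


def canonical_form(x):
    """Q = (1/2) * sum_i a_i x_i^2."""
    return 0.5 * sum(a * xi * xi for a, xi in zip(roots, x))


rng = random.Random(0)
nu = [0.5, -1.0]   # means of the y's (arbitrary illustrative values)
mu = rotate(nu)    # means of the x's, obtained in the same manner

max_gap = 0.0
for _ in range(1000):
    y = [rng.gauss(m, 1.0) for m in nu]
    x = rotate(y)
    max_gap = max(max_gap, abs(quadratic_form(y) - canonical_form(x)))

print(max_gap)  # should be at rounding-error level
```

Because the transformation is orthogonal, the $x_i$ keep unit variance and independence, which is what allows the theorem of the next section to be stated in terms of the $a_i$ and $\mu_i$ alone.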


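3. The solution.


THEOREM. Let $Q = \tfrac{1}{2} \sum_{i=1}^{n} a_i x_i^2$, where the $x_i$ are independent $N(\mu_i, 1)$ variates
and where $a_i > 0$ for $i = 1, 2, \cdots, n$. Let

\[ Q^* = \tfrac{1}{2} \sum_{i=1}^{n} a_i^{-1} (x_i - \mu_i)^2, \qquad L^* = \sum_{i=1}^{n} (2a_i)^{-1/2} (x_i - \mu_i)\, \mu_i, \]

and $D = a_1^{1/2} a_2^{1/2} \cdots a_n^{1/2}$. Then

a) \[ F(t) = D^{-1} \exp\Bigl(-\tfrac{1}{2} \sum_{i=1}^{n} \mu_i^2\Bigr) \sum_{j,k=0}^{\infty} \frac{(-1)^j\, 2^k\, t^{\frac{1}{2}n + j + k}\, E(Q^{*j} L^{*2k})}{j!\,(2k)!\,\Gamma(j + k + 1 + \tfrac{1}{2}n)}, \]

b) the series in (a) is absolutely convergent,

c) for any two non-negative integers $r$ and $s$ and every $t > 0$,

\[ S_{2s} = \sum_{j=0}^{2s} d_j \ge F(t) \ge \sum_{j=0}^{2r+1} d_j = S_{2r+1}, \]

where

\[ d_j = \exp\Bigl(-\tfrac{1}{2} \sum_{i=1}^{n} \mu_i^2\Bigr)\, \frac{(-1)^j\, t^{\frac{1}{2}n + j}}{D\, j!} \sum_{k=0}^{\infty} \frac{2^k\, t^k\, E(Q^{*j} L^{*2k})}{(2k)!\,\Gamma(\tfrac{1}{2}n + j + k + 1)}. \]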
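PROOF. Let $R$ represent the region where $\tfrac{1}{2} \sum_{i=1}^{n} a_i x_i^2 \le t$. Then, with $dx =
dx_1\, dx_2 \cdots dx_n$,

\[
\begin{aligned}
F(t) &= (2\pi)^{-\frac{1}{2}n} \int \cdots \int_R \exp\Bigl[-\tfrac{1}{2} \sum_{i=1}^{n} (x_i - \mu_i)^2\Bigr] dx \\
&= (2\pi)^{-\frac{1}{2}n} \exp\Bigl(-\tfrac{1}{2} \sum_{i=1}^{n} \mu_i^2\Bigr) \int \cdots \int_R \exp\Bigl(-\tfrac{1}{2} \sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} x_i \mu_i\Bigr) dx.
\end{aligned}
\]

Expanding the exponential in the integrand, we get

\[ F(t) = C \int \cdots \int_R \sum_{j=0}^{\infty} \frac{1}{j!} \Bigl(-\tfrac{1}{2} \sum x_i^2\Bigr)^{j} \sum_{k=0}^{\infty} \frac{1}{k!} \Bigl(\sum x_i \mu_i\Bigr)^{k} dx, \tag{1} \]

where $C = (2\pi)^{-\frac{1}{2}n} \exp(-\tfrac{1}{2} \sum_{i=1}^{n} \mu_i^2)$, i.e.,

\[ F(t) = C \sum_{j,k=0}^{\infty} \frac{(-\tfrac{1}{2})^j}{j!\,k!} \int \cdots \int_R \Bigl(\sum x_i^2\Bigr)^{j} \Bigl(\sum x_i \mu_i\Bigr)^{k} dx, \tag{2} \]

\[ F(t) = C \sum_{j,k=0}^{\infty} \frac{(-\tfrac{1}{2})^j}{j!\,k!} \sum_{r\text{'s}} \sum_{\eta\text{'s}} \frac{j!\,k!\; \mu_1^{\eta_1} \cdots \mu_n^{\eta_n}}{r_1! \cdots r_n!\; \eta_1! \cdots \eta_n!} \int \cdots \int_R \prod_{i=1}^{n} x_i^{2r_i + \eta_i}\, dx, \tag{3} \]

where $\sum_{r\text{'s}}$ and $\sum_{\eta\text{'s}}$ denote respectively the sums taken over all non-negative
integral $r_i$'s and $\eta_i$'s subject to the respective conditions

\[ r_1 + r_2 + \cdots + r_n = j, \qquad \eta_1 + \eta_2 + \cdots + \eta_n = k. \]

Now by a well-known formula of Dirichlet it is easy to show that the rightmost
$n$-tuple integral in (3) is 0, if any $\eta_i$ is odd, and is

\[ \frac{(2t)^{j + \frac{1}{2}(k+n)}\; \Gamma[\tfrac{1}{2}(2r_1 + \eta_1 + 1)] \cdots \Gamma[\tfrac{1}{2}(2r_n + \eta_n + 1)]}{D\, \Gamma[\tfrac{1}{2}(k + n) + j + 1]\; a_1^{r_1 + \frac{1}{2}\eta_1} \cdots a_n^{r_n + \frac{1}{2}\eta_n}} \tag{4} \]

if all $\eta_i$ are even, and so we need consider only even values of $\sum \eta_i = k$. Hence

\[ \int \cdots \int_R \Bigl(\sum x_i^2\Bigr)^{j} \Bigl(\sum x_i \mu_i\Bigr)^{2k+1} dx = 0, \tag{5} \]

\[
\begin{aligned}
\int \cdots \int_R \Bigl(\sum x_i^2\Bigr)^{j} \Bigl(\sum x_i \mu_i\Bigr)^{2k} dx
&= \frac{j!\,(2k)!\,(2t)^{j + k + \frac{1}{2}n}}{D\, \Gamma(\tfrac{1}{2}n + j + k + 1)} \\
&\quad \times \sum_{r\text{'s}} \sum_{\eta\text{'s}} \frac{\mu_1^{2\eta_1} \cdots \mu_n^{2\eta_n}\; \Gamma[\tfrac{1}{2}(2r_1 + 2\eta_1 + 1)] \cdots \Gamma[\tfrac{1}{2}(2r_n + 2\eta_n + 1)]}{r_1! \cdots r_n!\; (2\eta_1)! \cdots (2\eta_n)!\; a_1^{r_1 + \eta_1} \cdots a_n^{r_n + \eta_n}},
\end{aligned}
\tag{6}
\]

where $\sum_{\eta\text{'s}}$ now means summation over all non-negative integral $\eta_1, \cdots, \eta_n$ such
that $\sum_{i=1}^{n} \eta_i = k$. The problem is to evaluate this last expression. Recalling
that, if $x_i$ is $N(\mu_i, 1)$, then $E\{\tfrac{1}{2}(x_i - \mu_i)^2\}^r = \Gamma(r + \tfrac{1}{2})/\Gamma(\tfrac{1}{2})$, we find

\[
\begin{aligned}
E(Q^{*j} L^{*2k}) &= E\Bigl\{\Bigl[\tfrac{1}{2} \sum_{i=1}^{n} a_i^{-1} (x_i - \mu_i)^2\Bigr]^{j} \Bigl[\sum_{i=1}^{n} (2a_i)^{-1/2} (x_i - \mu_i)\, \mu_i\Bigr]^{2k}\Bigr\} \\
&= \sum_{r\text{'s}} \sum_{\eta\text{'s}} \frac{j!\,(2k)!\; \mu_1^{2\eta_1} \cdots \mu_n^{2\eta_n}\; E(x_1 - \mu_1)^{2r_1 + 2\eta_1} \cdots E(x_n - \mu_n)^{2r_n + 2\eta_n}}{2^{j+k}\; r_1! \cdots r_n!\; (2\eta_1)! \cdots (2\eta_n)!\; a_1^{r_1 + \eta_1} \cdots a_n^{r_n + \eta_n}}.
\end{aligned}
\tag{7}
\]

Substituting the value of $E(x_i - \mu_i)^{2r_i + 2\eta_i}$, we find that equation (6) is equivalent
to

\[ \int \cdots \int_R \Bigl(\sum x_i^2\Bigr)^{j} \Bigl(\sum x_i \mu_i\Bigr)^{2k} dx = \frac{(2\pi)^{\frac{1}{2}n}\, 2^{j+k}\, t^{j + k + \frac{1}{2}n}}{D\, \Gamma(\tfrac{1}{2}n + j + k + 1)}\, E(Q^{*j} L^{*2k}). \tag{8} \]

Hence equation (2) gives

\[ F(t) = \exp\Bigl(-\tfrac{1}{2} \sum_{i=1}^{n} \mu_i^2\Bigr) \sum_{j,k=0}^{\infty} \frac{(-1)^j\, 2^k\, t^{j + k + \frac{1}{2}n}\, E(Q^{*j} L^{*2k})}{j!\,(2k)!\, D\, \Gamma(\tfrac{1}{2}n + j + k + 1)}. \tag{9} \]

This proves part (a) of the theorem.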
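To show absolute convergence for part (b) we note that, if $a = \min a_i$ and

\[ \gamma = \sum_{i=1}^{n} \mu_i^2 / a_i, \]

then

\[ Q^* \le \sum_{i=1}^{n} (x_i - \mu_i)^2 / (2a), \qquad L^{*2} \le \tfrac{1}{2} \gamma \sum_{i=1}^{n} (x_i - \mu_i)^2, \]

so that

\[ E(Q^{*j} L^{*2k}) \le \frac{\gamma^k\, \Gamma(j + k + 1 + \tfrac{1}{2}n)}{a^j\, \Gamma(\tfrac{1}{2}n + 1)}. \tag{10} \]

Hence, if $F^*(t)$ is the sum of the absolute values of the terms of the series for
$F(t)$, then

\[ F^*(t) \le \frac{\exp\bigl(-\tfrac{1}{2} \sum_{i=1}^{n} \mu_i^2\bigr)\; t^{\frac{1}{2}n} \exp(t/a)\, \cosh\bigl[(2\gamma t)^{1/2}\bigr]}{D\, \Gamma(\tfrac{1}{2}n + 1)} < \infty \quad \text{for } t < \infty. \tag{11} \]

This proves part (b) of the theorem.
The bounds of part (c) are based on the fact that, if $r$ and $s$ are any two
non-negative integers, then, for $t$ positive,

\[ \sum_{j=0}^{2s} \frac{(-t)^j}{j!} \ge e^{-t} \ge \sum_{j=0}^{2r+1} \frac{(-t)^j}{j!}. \]

Replacing $t$ by $\tfrac{1}{2} \sum_{i=1}^{n} x_i^2$ and using (1), (2) and (8), part (c) of the theorem
can be established.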

REMARKS.
(i) The joint moments of $Q^*$ and $L^*$ are easy to obtain from the joint cumulants
of $Q^*$ and $L^*$. If $K_{r,j}$ is the $(r, j)$ cumulant of $(Q^*, L^*)$ then

\[
\begin{aligned}
K_{r,0} &= (r - 1)!\, \tfrac{1}{2} \sum_{i=1}^{n} a_i^{-r} && \text{for } r \ge 1, \\
K_{r,2} &= r! \sum_{i=1}^{n} \mu_i^2 / (2 a_i^{r+1}) && \text{for } r \ge 0, \\
K_{r,j} &= 0 && \text{for } j = 1, 3, 4, 5, \cdots \text{ and } r \ge 0.
\end{aligned}
\]

(ii) Clearly $S_0, S_2, S_4, \cdots$ is a sequence of upper bounds and $S_1, S_3, S_5, \cdots$
is a sequence of lower bounds for $F(t)$. In practice we would compute a
finite number of terms and then state that $\min S_{2s} \ge F(t) \ge \max S_{2r+1}$. The
absolute value of the error thus committed is not greater than

\[ \min S_{2s} - \max S_{2r+1}. \]


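4. Applications to the distribution of a sum of squares in dependent variates.

(a) Let $y_1, y_2, \cdots, y_n$ have a joint multivariate normal distribution with
means $\nu_1, \cdots, \nu_n$ and variance-covariance matrix $A$. We wish to find the
distribution of $R = \tfrac{1}{2} y'y$. Now

\[
\begin{aligned}
\Pr(R \le t) &= |A|^{-\frac{1}{2}} (2\pi)^{-\frac{1}{2}n} \int \cdots \int_{\frac{1}{2} y'y \le t} \exp\bigl[-\tfrac{1}{2} (y - \nu)' A^{-1} (y - \nu)\bigr]\, dy \\
&= (2\pi)^{-\frac{1}{2}n} \int \cdots \int_{\frac{1}{2} x'Ax \le t} \exp\bigl[-\tfrac{1}{2} (x - \mu)'(x - \mu)\bigr]\, dx,
\end{aligned}
\]

where $Px = y$ and $P\mu = \nu$, $P^2 = A$ ($P$ is a symmetric matrix). We can now
directly apply the theorem.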

Remarks. Combining the results of the theorem and the above application, 
it is easy to show that we could find the distribution of a definite (or semi- 
definite) quadratic form y’By, which involves as parameters the latent roots of 
AB, and with non-central parameters depending on the given means, the vari- 
ance-covariance matrix A and the latent roots of AB. 


(b) The complex normal distribution defined by Wooding [3] and Turin [2]
has density function given by

\[ \pi^{-n} |L|^{-1} \exp\bigl[-(v - \nu)^{*\prime} L^{-1} (v - \nu)\bigr], \]

where $v = z + iw$ and $E(v) = \nu$, $x^*$ is the complex conjugate of $x$,
E{z — E(z)\{z — E(z)|' = ly, 
E(w — E(w)][z — E(z)!’ = —Elz — E(z)\[w — E(w)]' = Ll 


and $L = L_1 + iL_2$ is a hermitian positive definite matrix.
Then for the distribution of $v^{*\prime} v = R$, we have

\[ \Pr(R \le t) = \pi^{-n} |L|^{-1} \int \cdots \int_{R \le t} \exp\bigl[-(v - \nu)^{*\prime} L^{-1} (v - \nu)\bigr]\, dv, \]

where $dv = dz_1\, dz_2\, dz_3 \cdots dz_n\, dw_1\, dw_2 \cdots dw_n = dz\, dw$. This is equivalent to
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Pr(R St) = 9" |L|- f.. [. | - D a5" { (ay — wy)? + (a5 — pu) |x 


j=l 


where dx = dx, dx, , the a;’s (a; > 0) are the latent roots of L and 


me n n 
vv = > tut pe Mij + > fmt Me Maj, 


or 


Pr(R St) =7 "-- -fexp| -) > { (a1; — my /V as)” + (a5 — ps/Va) | dx 


where x; = 21; + 23; . This can be easily solved by the direct use of the theorem. 

REMARKS. Combining the results of the theorem and the above application,
it is easy to show that we could find the distribution of a definite (or semi-definite)
Hermitian form $v^{*\prime} A v$ ($A$ positive definite or semi-definite), which
involves as parameters the latent roots of $AL$ and with non-central parameters
depending on the given means, the variance-covariance matrix $L$, and the latent
roots of $AL$.
PERCENTAGE POINTS AND MODES OF ORDER STATISTICS
FROM THE NORMAL DISTRIBUTION


By Shanti S. Gupta


Bell Telephone Laboratories


0. Summary. This paper deals with the order statistics from the normal dis- 
tribution. Equations are obtained for the percentage points and the modal 
values of the kth order statistic in a sample of size n. Table I gives some per- 
centage points for selected values of k and n. In Table II the modal values of the 
largest order statistic are given. Appropriate symmetry relations which enable 
one to obtain certain missing values in Tables I and II are mentioned. 


1. Introduction. Let $x_1, x_2, \cdots, x_n$ be $n$ independent observations from a
normal distribution with probability density function

\[ \varphi(x) = (2\pi)^{-\frac{1}{2}} \exp(-x^2/2) \]

and cumulative distribution function $\Phi(x)$. Suppose the observations are arranged
in order of increasing magnitude so that we have

\[ x_{(1)} \le x_{(2)} \le \cdots \le x_{(k)} \le \cdots \le x_{(n)}; \tag{1} \]

then we shall denote the $k$th order statistic by $x_{(k)}$. In this paper we use the
same symbol for both a random variable and an observation on it.


2. Percentage points of the order statistics. The probability density function
(p.d.f.) of $y = x_{(k)}$ is given by

\[ f(y) = \frac{n!}{(k - 1)!\,(n - k)!}\, \Phi^{k-1}(y)\, [1 - \Phi(y)]^{n-k}\, \varphi(y), \tag{2} \]

and its cumulative distribution function (c.d.f.) $F(y)$ is

\[ F(y) = \frac{n!}{(k - 1)!\,(n - k)!} \int_{-\infty}^{y} \Phi^{k-1}(x)\, [1 - \Phi(x)]^{n-k}\, \varphi(x)\, dx, \tag{3} \]

which can be written as

\[ F(y) = I_{\Phi(y)}(k, n - k + 1), \tag{4} \]

where $I_x(p, q)$ denotes the ratio of the incomplete Beta function to the complete
Beta function with arguments $p$ and $q$ and which is tabulated in [4]. Thus
the $\alpha$ percentage point of the $k$th order statistic is given by

\[ I_{\Phi(y)}(k, n - k + 1) = \alpha. \tag{5} \]

The percentage points of the Beta distribution with parameters $p$ and $q$ are
tabulated in [3] for selected values of $p$ and $q$ respectively.
The following symmetry relation enables one to obtain other values of the
percentage points of the Beta distribution

\[ I_x(p, q) = 1 - I_{1-x}(q, p). \tag{6} \]

Other missing values can be obtained by inverse interpolation in the Table of
the Incomplete Beta Function [1].

It should be noted that, in the particular cases when $k = 1$ and $k = n$, equation
(4) reduces to simple forms, and hence the percentage point of the smallest
order statistic is the solution of

\[ \Phi(y) = 1 - (1 - \alpha)^{1/n} \tag{7} \]

and the percentage point of the largest order statistic is the solution of

\[ \Phi(y) = \alpha^{1/n}. \tag{8} \]

By putting $y = -z$ and $\alpha = 1 - \gamma$ in (5) and using (6) we obtain

\[ I_{\Phi(z)}(n - k + 1, k) = \gamma, \tag{9} \]

which implies that the $\alpha$ percentage point of the $k$th order statistic in a sample
of size $n$ is the negative of the $1 - \alpha$ percentage point of the $(n - k + 1)$th
order statistic, and vice-versa.

Using (7), (8) and (5), the upper percentage points of these order statistics
were computed for $n = 1(1)10$, $k = 1(1)n$ and $\alpha = .50, .75, .90, .95$ and $.99$.
Also, the same upper percentage points for $n = 11(1)20$ for the smallest, largest
and median order statistics were computed. All these values are given in Table I.

It should be pointed out that (1)-(8) are generally applicable to any continuous
distributions when $\Phi(y)$ and $\varphi(y)$ are replaced appropriately; for example,
these equations are used to obtain the percentage points of the order
statistics from the gamma distributions in [2].
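The closed forms (7) and (8) for the extreme order statistics are easy to evaluate with a modern inverse normal c.d.f. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the function names are hypothetical, and the sketch is not the procedure used to produce Table I. It also illustrates the symmetry implied by (9): the $\alpha$ point of the largest order statistic is the negative of the $1 - \alpha$ point of the smallest.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def largest_percentage_point(alpha, n):
    """Solve Phi(y) = alpha**(1/n), equation (8)."""
    return STD_NORMAL.inv_cdf(alpha ** (1.0 / n))


def smallest_percentage_point(alpha, n):
    """Solve Phi(y) = 1 - (1 - alpha)**(1/n), equation (7)."""
    return STD_NORMAL.inv_cdf(1.0 - (1.0 - alpha) ** (1.0 / n))


n = 10
pairs = []
for alpha in (0.50, 0.75, 0.90, 0.95, 0.99):
    upper = largest_percentage_point(alpha, n)
    lower = smallest_percentage_point(1.0 - alpha, n)
    pairs.append((upper, lower))
    print(alpha, round(upper, 4), round(lower, 4))
```

For intermediate $k$ the same idea applies with an inverse incomplete Beta function in place of the explicit root in (7) and (8).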


3. Modal values of the order statistics. The mode or modal value $x$ of the $k$th
order statistic satisfies the equation

\[ \frac{n!}{(k - 1)!\,(n - k)!}\, \frac{d}{dx} \bigl\{\Phi^{k-1}(x)\, [1 - \Phi(x)]^{n-k}\, \varphi(x)\bigr\} = 0, \tag{10} \]

which simplifies to

\[ (k - 1)[1 - \Phi(x)]\varphi(x) - (n - k)\Phi(x)\varphi(x) = x\,\Phi(x)[1 - \Phi(x)]. \tag{11} \]

The equation (11) remains unchanged if we substitute $-x$ for $x$ and $n - k + 1$
for $k$. This shows that the modal values of the $k$th and $(n - k + 1)$th order
statistics are equal and opposite in sign. In particular, the modal value of the
median, i.e., the $(m + 1)$th order statistic in a sample of size $n = 2m + 1$, is
equal to zero. Also, from the above symmetrical relation, it follows that for any
order statistic which is above the median the modal value is a positive number.
For the particular case $k = n$, the equation (11) reduces to

\[ (n - 1)\varphi(x) = x\,\Phi(x), \tag{12} \]

which was solved for $n = 1(1)25(5)50(10)100$ to give the values of the mode
of the largest order statistic in a sample of size $n$ from the normal distribution.
These values are given in Table II.

It should be noted that with obvious changes the above text, up to and including
(12), can be applied to any continuous variable distributed symmetrically
about zero.


TABLE I

Percentage Points of the kth Order Statistic in a
Sample of Size n from the Normal Distribution

This table gives the values of $y$ for which

\[ \frac{n!}{(k - 1)!\,(n - k)!} \int_{-\infty}^{y} \Phi^{k-1}(x)\, [1 - \Phi(x)]^{n-k}\, \varphi(x)\, dx = \alpha, \]

where $\varphi(x)$ and $\Phi(x)$ refer to the p.d.f. and c.d.f. of a standard
normal chance variable, respectively.


TABLE II

Modal Values of the Largest Order Statistic in a Sample
of Size n from the Standard Normal Distribution

This table gives the value of $x$ for which

\[ (n - 1)\varphi(x) = x\,\Phi(x), \]

where $\varphi(x)$ and $\Phi(x)$ refer to the p.d.f. and the c.d.f.
of the standard normal chance variable, respectively.


4. Description of the tables. Table I gives the $\alpha$ percentage points of the $k$th
order statistic in a sample of size $n$ from a standard normal distribution. Values
of $\alpha$ chosen are .50, .75, .90, .95, and .99, which correspond to the upper 50, 25,
10, 5 and 1 per cent points of the distribution. The percentage points corresponding
to $\alpha$ = .25, .10, .05 and .01, viz., the lower percentage points of the $k$th order
statistic, can be obtained from Table I by changing the sign of the corresponding
upper percentage point of the $(n - k + 1)$th order statistic. For values of
$n = 1(1)10$, percentage points are given for all $k = 1(1)n$. For values of $n =
11(1)20$, the percentage points are given only for $k = 1, \tfrac{1}{2}(n + 1), n$ for odd
values of $n$ and for $k = 1, \tfrac{1}{2}n, \tfrac{1}{2}n + 1, n$ for even values of $n$.

The values given were computed by using Newton's method on an IBM 650.
The values of the percentage points of the median then were checked against the
available values in [1] and were found to be in agreement. Other independent
checks indicate that the percentage points are correct to within one unit in the
last decimal place.

Table II gives the values of the modes of the largest order statistic for n = 
1(1) 25(5) 50(10) 100. The modal values of the smallest order statistic for a 
given n are obtained from this table by changing the sign of the corresponding 
value. These values were also obtained by using Newton’s procedure and are 
correct to four decimal places. 
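The modal equation $(n - 1)\varphi(x) = x\,\Phi(x)$ for the largest order statistic can be solved by Newton's procedure in a few lines. The following is a sketch under stated assumptions, not a reconstruction of the original IBM 650 program: the starting bracket, the bisection fallback, and the function name `mode_of_largest` are additions made here for robustness.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def mode_of_largest(n, tol=1e-10, max_iter=100):
    """Solve (n - 1) * phi(x) = x * Phi(x) by safeguarded Newton steps."""

    def g(x):
        return (n - 1) * STD_NORMAL.pdf(x) - x * STD_NORMAL.cdf(x)

    def g_prime(x):
        # Uses d/dx phi(x) = -x phi(x) and d/dx Phi(x) = phi(x).
        return -n * x * STD_NORMAL.pdf(x) - STD_NORMAL.cdf(x)

    # g(0) >= 0, and g is negative at the upper end chosen below.
    lo, hi = 0.0, math.sqrt(2.0 * math.log(n)) + 1.0
    x = 0.5 * (lo + hi)
    for _ in range(max_iter):
        gx = g(x)
        if gx == 0.0:
            return x
        if gx > 0:
            lo = x
        else:
            hi = x
        step = x - gx / g_prime(x)
        # Fall back to bisection if the Newton step leaves the bracket.
        new_x = step if lo < step < hi else 0.5 * (lo + hi)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


for n in (1, 2, 10, 100):
    print(n, round(mode_of_largest(n), 4))
```

By the symmetry noted above, the mode of the smallest order statistic for the same $n$ is the negative of the value returned.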


5. Acknowledgments. The author wishes to thank Miss Phyllis A. Groll for
assistance in computing Tables I and II and Mr. H. Leon Harter for helpful
comments, including corrections to the entries in Table II.
NOTES


EXPRESSING A RANDOM VARIABLE IN TERMS OF UNIFORM 
RANDOM VARIABLES 


By G. MARSAGLIA 


Boeing Scientific Research Laboratories 


Summary. This note suggests that expressing a distribution function as a mix- 
ture of suitably chosen distribution functions leads to improved methods for gen- 
erating random variables in a computer. The idea is to choose a distribution func- 
tion which is close to the original and use it most of the time, applying the 
correction only infrequently. Mixtures allow this to be done in probability terms 
rather than in the more elaborate ways of conventional numerical analysis, 
which must be applied every time. 


Introduction. We are concerned with procedures for generating sequences of
numbers which will serve as independent determinations of a random variable
$x$ with specified distribution function $F$. Currently, the most satisfactory method
is to use an arithmetic procedure for generating a sequence of numbers $u_1, u_2, \cdots$
which serve as "independent" determinations of a uniform (0, 1) random variable,
and then to generate the required $x$ in terms of the $u$'s. There is no, or very
little, probability theory concerned with the $u$'s; they are generated recursively,
say by putting $u_{i+1} = \alpha u_i + \beta \pmod{m}$, where $\alpha$, $\beta$ and $m$ are chosen to make
the resulting sequence meet the user's requirements for "randomness." See
references [1] and [2].

If we are willing to grant the adequacy of such procedures and take as our
starting point a sequence

\[ u_1, u_2, u_3, \cdots \]

of independent uniform (0, 1) random variables, then we may use some probability
theory in searching for methods for expressing a random variable $x$ with
distribution function $F$ in terms of the $u$'s, guided, of course, by the suitability
of such methods for use in programs for digital computers. A summary of existing
methods for generating a normal random variable is given by Muller in [3]. We
will not go into details of the various methods on record, but point out that the
fastest method in Muller's summary is one of his own [4] that takes, using a unit
familiar to programmers, about 120 cycles and provides $F$ to within a certain
accuracy, while programs based on the methods outlined below will be more
accurate and have average running times on the order of 15-20 cycles.


Methods. Suppose we have a method $M_1$ for providing a random variable $y_1$
with distribution function $G_1$, and method $M_1$ takes 10 cycles. Suppose we also
have a method $M_2$ for providing a random variable $y_2$ with distribution function
$G_2$, and method $M_2$ takes 500 cycles. If we can represent $F$ as a mixture of $G_1$
and $G_2$, say

\[ F(a) = .99\,G_1(a) + .01\,G_2(a), \]

then we may use $u_1, u_2, u_3, \cdots$ to provide a random variable $x$ whose distribution
is $F$ as follows: if $u_1 < .99$, we use method $M_1$ on $u_2, u_3, \cdots$ to generate
$y_1$ and put $x = y_1$. If $.99 \le u_1 \le 1$, we use method $M_2$ on $u_2, u_3, \cdots$ to generate
$y_2$ and put $x = y_2$. The average running time will then be (.99)10 + (.01)500 =
14.9 cycles, plus the time necessary to test $u_1 < .99$.
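The selection step can be sketched as follows. The two generating methods here are hypothetical stand-ins (uniform distributions on disjoint intervals) chosen only so that the sketch runs; the cycle counts 10 and 500 are those of the illustration above, and the "cost" tally is a bookkeeping device, not a measurement of machine time.

```python
import random


def mixture_sample(rng, fast_method, slow_method, p_fast=0.99):
    """Use u_1 to pick the method; later uniforms are consumed by the method."""
    u1 = rng.random()
    if u1 < p_fast:
        return fast_method(rng), "fast"
    return slow_method(rng), "slow"


# Hypothetical stand-ins for M_1 (10 cycles) and M_2 (500 cycles).
def method_1(rng):
    return rng.random()          # pretend G_1 is uniform on (0, 1)


def method_2(rng):
    return 1.0 + rng.random()    # pretend G_2 is uniform on (1, 2)


COST = {"fast": 10, "slow": 500}
expected_cost = 0.99 * COST["fast"] + 0.01 * COST["slow"]

rng = random.Random(1)
draws = [mixture_sample(rng, method_1, method_2) for _ in range(100_000)]
observed_cost = sum(COST[label] for _, label in draws) / len(draws)
print(expected_cost, observed_cost)
```

The observed average cost fluctuates around the expected one because the slow branch is taken only about once in a hundred draws.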

This illustrates the basic principle of the simple device which provides programs
with very short average running times: we represent $F$ as a mixture of
distributions,

\[ F(a) = p_1 G_1(a) + p_2 G_2(a), \tag{1} \]

in such a way that $p_1$ is close to 1 and the time to generate a random variable
$y_1$ with distribution $G_1$ is small; then most of the time we put $x = y_1$. Even
though $G_2$, the correcting distribution, may be quite complicated and difficult
to handle, we still have a short average running time, since $G_2$ must be handled
so infrequently.

In searching for representations of $F$ such as (1), if we have a $G_1$ which shows
promise, then we find the largest $p_1$ so that $F - p_1 G_1$ is monotone. It seems better
to work with densities, and find $p_1$ so that $f(x) - p_1 g_1(x)$ is non-negative.
Furthermore, if we make $g_1$ a rectangle or a mixture of rectangles, we can expect
to have very short running times, especially if the rectangles are chosen so as to
exploit the particular features of the computer in question.

We give some examples which will serve to illustrate the above remarks. 

EXAMPLE 1. Let

\[ h(x) = \frac{2}{\pi (1 - e^{-9/2})} \int_0^{(9 - x^2)^{1/2}} e^{-(x^2 + y^2)/2}\, dy, \qquad 0 \le x \le 3. \tag{2} \]

We will see in Example 2 how $h$ may be used in generating a normal random
variable. Suppose we want to express a random variable $z$ with density $h$ in terms
of the members of the sequence $u_1, u_2, u_3, \cdots$. We write


hie) w coos ida) +: somal + sae or h(a), 


2048 ois 
where the terms in the mixture are drawn in Figure 1, A; and h2 are mixtures of 
rectangles and h; is the residuum. Then: 

(i) The rectangles which make up h; have base } and altitude an integral 
multiple of #5. A random variable z, with density h; may be formed by putting 
2, = a; + } w where a; is chosen with probability 3}. z: may be formed in about 
10 cycles. 

(ii) The rectangles in hz all have the same area, so that a random variable 2, 
with density he may be formed by putting z2 = c; + byw , where the pairs (c; , b;) 
are chosen with equal probability. Time, about 20 cycles. 
(iii) It is more difficult to generate $z_3$ with density $h_3$. Most of the "teeth" in
$h_3$ may be replaced by triangles, so that $z_3 = e_i + d_i \min(u_2, u_3)$ will do, where
the pair $(e_i, d_i)$ is chosen with a certain probability. The probabilities associated
with these pairs are all different so that the computer must spend some time finding
the appropriate one, and a few other parts of $h_3$ must be handled by even
slower methods, but the entire procedure shouldn't take more than 200 cycles.

Thus, to generate a random variable $z$ with density $h$, we use $u_1$ to choose one
of the above methods, generate a number, and call it $z$. The average time is
around 15 cycles.

EXAMPLE 2. The normal distribution. We describe a method for generating a
random variable $x$ with the absolute normal density:

\[ f(x) = (2/\pi)^{1/2}\, e^{-x^2/2}, \qquad 0 \le x. \]

A random $\pm$ may be attached later. The tail of $f$ offers some difficulties which we
avoid by using a suggestion of D. MacLaren. If $x$ and $y$ are independent with
density $f$ then the distribution of $x$, given that $x^2 + y^2 \le 9$, is $h$ in (2). Hence
we put $p = 1 - e^{-9/2} \approx .989$ and write

\[ f(x) = p\,h(x) + (1 - p)\,t(x), \]


where 


1 : 


To generate a random variable $x$ with density $f$ in terms of $u_1, u_2, \cdots$, we
test: is $u_1 < p$? Then:

(j) If $u_1 < p$, use the method of Example 1 on $u_2, u_3, \cdots$ to generate $z$.

(jj) If $p \le u_1 < 1$, put

\[ x = u_2 \left(\frac{9 + 2\rho}{u_2^2 + u_3^2}\right)^{1/2}, \tag{3} \]

where $\rho$ has the exponential distribution and $u_2^2 + u_3^2 \le 1$. The right side of (3)
has density $t$. We may use any of several methods for generating $\rho$. The time
necessary to generate $x$ in this way is relatively long, but we can afford to be
extravagant since we use this method only 1 percent of the time.


Remarks. The quick parts of the mixtures above are based on representing 
a random variable in the form a + bu where a and b are discrete random 
variables and u is uniform on the interval (0, 1). It is easy to show that any 
random variable may be so represented, in much the same way as the funda- 
mental result in analysis that a measurable function is the limit of a sequence of 
simple functions. The problem is to choose the discrete distributions of a and b 
in a suitable way to ensure short running times, consistent with the number of 
storage locations which can be allotted for the program. Only a certain number
of the probabilities for the values of $a$ and $b$ can be stored; if $a$ and $b$ are to have
an infinite set of values, then the machine must compute the probabilities from
some point on, or else $a$ and $b$ can be assigned a finite set of values and then the
residual portion of the distribution can be treated by other means, as in the case
of $h_3$ of Example 1.

At any rate, there is a wide variety of reasonable ways of assigning distributions
to $a$ and $b$. The one chosen in Example 1 requires a moderate amount of
storage, less than 1000 locations, and is quite fast. It can be made even faster by
increasing the number of rectangles in $h_1$ or $h_2$, at the expense of additional
storage space. An assignment of distributions for $a$ and $b$ which requires less
storage space than that suggested above is suggested by the rectangles in Figure 2.
The idea there is to let $x = u,\ u + 1,\ \tfrac{1}{2}u,\ \tfrac{1}{2}u + 1,\ u + 2, \cdots$, with the greatest
frequencies possible. A program based on that resolution of $h$ generates a random
variable $x$ with density $h$ by putting $x = u$ about 48 percent of the time, $x =
u + 1$ about 11 percent of the time, $x = u/2$ about 11 percent of the time, and
so on.
GENERATING EXPONENTIAL RANDOM VARIABLES


By G. MARSAGLIA 


Boeing Scientific Research Laboratories 


Introduction. The use of exponential random variables is mentioned in numer- 
ous papers on Monte Carlo techniques, in connection with particle or radiation 
studies, reliability, life testing, etc., but there seems to be little discussion of fast 
techniques for their generation. Von Neumann discussed a method in [1], where 
he remarked that in spite of the method’s appeal, in that it produced the desired 
result by performing only discriminations on the relative magnitude of numbers 
in (0, 1), it was a sad fact of life that it was slightly quicker to use a power series 
expansion to compute the logarithm of a uniform random variable. 

We offer here a simpler device for producing exponential random variables by 
performing discriminations on the relative magnitudes of uniform (0, 1) random 
variables. The method is easy to understand, easy to program, requires little 
storage, and is quite fast, although not quite as fast as one of the versions of the 
general method given in [2]. 

The idea is to choose the minimum of a random number of uniform random
variables, then add a random integer; roughly, let $n$ and $m$ be random integers
taking values according to this schedule:

    Value of n   probability  |  Value of m   probability
        1           .58       |      0           .63
        2           .29       |      1           .23
        3           .10       |      2           .09
        4           .02       |      3           .03

Then, if $u_1, u_2, \cdots$ is a sequence of independent uniform random variables on
(0, 1), the random variable

\[ m + \min(u_1, u_2, \cdots, u_n) \]

has the exponential distribution. The expected value of $n$ is about 1.58, so that
we need on the average only 1.58 $u$'s from which the minimum must be selected,
and 1.58 discriminations to assign a value to $n$. Assigning a value to $m$ also requires
an average of 1.58 discriminations.


Details. We are concerned with a method for expressing an exponentially distributed
random variable $x$ in terms of the members of a sequence

\[ u_1, u_2, \cdots \tag{1} \]

of independent random variables, each uniformly distributed over (0, 1). Since
there are available a wide variety of arithmetic procedures for producing a
sequence of numbers which the users are willing to view as determinations of the
sequence (1), we assume the availability of that sequence for our starting point.

Let $n$ be a random variable taking values 1, 2, 3, $\cdots$ with probabilities
$p_1, p_2, \cdots$. If

\[ y = \min(u_1, u_2, \cdots, u_n), \]

then it is easy to see that the distribution of $y$ is, for $0 < \theta < 1$,

\[ P[y < \theta] = 1 - p_1(1 - \theta) - p_2(1 - \theta)^2 - p_3(1 - \theta)^3 - \cdots. \tag{2} \]

In particular, if $c = 1/(e - 1) = .5819767\cdots$, and

\[ p_1 = c,\quad p_2 = c/2!,\quad p_3 = c/3!,\ \cdots \]

then

\[ P[y \le \theta] = ce\,(1 - e^{-\theta}). \]

We express our principal result in this way:

THEOREM. If $c = 1/(e - 1)$ and if the random variable $n$ takes values 1, 2, 3, $\cdots$
with probabilities $c, c/2!, c/3!, \cdots$ and if, independently, the random variable $m$
takes values 0, 1, 2, $\cdots$ with probabilities $1/(ce), 1/(ce^2), 1/(ce^3), \cdots$, then the
random variable

\[ x = m + \min(u_1, u_2, \cdots, u_n) \]

has the exponential distribution,

\[ P[x \le a] = 1 - e^{-a}, \qquad 0 < a. \]

The proof is a simple matter of verification: if $a = k + \theta$ where $k$ is a non-negative
integer and $0 < \theta < 1$, then

\[
\begin{aligned}
P[x \le k + \theta] &= P[m \le k - 1] + P[m = k,\ y \le \theta] \\
&= 1 - e^{-k} + [ce/(ce^{k+1})](1 - e^{-\theta}) = 1 - e^{-k-\theta}.
\end{aligned}
\]

Remarks. The trick of taking the minimum of a random number of uniform 
random variables may be applied to any density function which can be put in the 
form of (2). Unfortunately, the normal density doesn’t fit this pattern, although 
generalizations of the method, such as taking the minimax of a doubly indexed
set of $u$'s, seem to be feasible without entailing too much programming detail.
But these tricks cannot get much more complicated and still stay simpler than
the methods suggested in [2], which lead to very fast programs, surpassed only 
by colossal table look-ups. 
A COMBINATORIAL LEMMA FOR COMPLEX NUMBERS


By Glen Baxter


Aarhus University, Aarhus, Denmark


Although combinatorial lemmas have been used quite successfully in analyzing 
sums of random variables [1, 2], to the best of our knowledge these considerations 
have been restricted to the case of real numbers and real variables. It is our 
purpose in this note to show by a simple example that combinatorial lemmas 
for complex numbers can also be given and applied to analyzing random walks 
in the plane. 

1. Random walks in the plane. Let $\{Z_k\}$ be a sequence of independent, identically
distributed complex-valued random variables. Let $S_0 = 0$, and let $S_n =
Z_1 + \cdots + Z_n$, $n \ge 1$. We call $S_0, S_1, \cdots, S_n, \cdots$ a random walk in the
plane. The combinatorial lemmas given below are concerned with the convex
hull of the random walk. Specifically, every walk $S_0, \cdots, S_n$ ($n + 1$ points in
the plane) determines a smallest closed, convex set containing these points.
The boundary of this set is called the (convex) hull of $S_0, \cdots, S_n$. Later,
we will be concerned with three properties of the hull of a walk. We list these
properties in the form of variables for later reference.

(1)
$K_n$: the number of variables $Z_1, \cdots, Z_n$ which are line segments in
the hull of $S_0, \cdots, S_n$,

$H_n$: the number of line segments (sides) in the hull of $S_0, \cdots, S_n$,

$L_n$: the length of the hull of $S_0, \cdots, S_n$.


2. Combinatorics. Let $z_1, z_2, \cdots, z_n$ be a set of $n$ complex numbers and let
$s_k = z_1 + \cdots + z_k$. If $\sigma: i_1, \cdots, i_n$ is any permutation of $1, 2, \cdots, n$, we
let $s_k(\sigma) = z_{i_1} + \cdots + z_{i_k}$. The notation $z_A$ will represent the sum of the vectors
in a subset $A$ of $z_1, \cdots, z_n$ while $\bar z_A$ will denote the (non-directed) line segment
corresponding to $z_A$. We need an important definition which seems to be the
natural analogue of "rational independence" for real numbers.

DEFINITION. Let $z_1, \cdots, z_n$ be complex numbers with partial sums $s_1, \cdots,
s_n$. We say the vectors $z_1, \cdots, z_n$ are skew if $z_A$ is parallel to $z_B$ only when
$A = B$.

Every vector $z$ in the plane, when extended along its length, determines two
half-planes which we call the right and left half-planes of $z$, respectively. We
include the line itself in both of the half-planes.

LEMMA 1. Let $z_1, \cdots, z_n$ be skew vectors with sum $z$. Then, there exists exactly
one cyclic permutation $\sigma$ of $z_1, \cdots, z_n$ such that the points $s_0(\sigma), s_1(\sigma), \cdots, s_n(\sigma)$
all lie in the left (right) half-plane of $z$. (See Fig. 1).





Fig. 1 


PROOF. Since $z_1, \cdots, z_n$ are skew vectors, there is at most one point among
$s_0, \cdots, s_{n-1}$ in the right half-plane of $z$ which is a maximum distance (possibly
zero) from the line determined by $z$. If $s_k$ is this point ($k = 0, 1, \cdots, n - 1$),
we take $\sigma: k + 1, \cdots, n, 1, \cdots, k$. The uniqueness of $\sigma$ follows from the uniqueness
of the index $k$. Note that among all $n!$ permutations of $z_1, \cdots, z_n$ exactly
$(n - 1)!$ are such that $s_0(\sigma), \cdots, s_n(\sigma)$ lie in the left half-plane of $z$.
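The closing count in the proof of Lemma 1 can be checked by brute force on a small case. The sketch below uses four integer vectors chosen for illustration (an assumption of this example, not data from the paper) so that the arithmetic is exact and no proper partial sum, in any order, falls on the line through $z$. It counts the permutations whose partial sums all lie in the closed left half-plane of $z$ and compares the count with $(n - 1)!$.

```python
from itertools import permutations
from math import factorial

# Illustrative integer vectors; their sum is z = (3, 10).
vectors = [(1, 0), (0, 1), (-3, 2), (5, 7)]
total = (sum(v[0] for v in vectors), sum(v[1] for v in vectors))


def cross(z, w):
    """Positive when w lies strictly to the left of the direction z."""
    return z[0] * w[1] - z[1] * w[0]


def stays_left(order):
    """True if every partial sum lies in the closed left half-plane of z."""
    x = y = 0
    for dx, dy in order:
        x, y = x + dx, y + dy
        if cross(total, (x, y)) < 0:
            return False
    return True


count = sum(stays_left(order) for order in permutations(vectors))
print(count, factorial(len(vectors) - 1))
```

Each of the $(n - 1)!$ classes of cyclically equivalent orderings contributes exactly one permutation to the count, which is the content of the lemma.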

Let $z_1, \cdots, z_n$ be a fixed set of skew vectors. Every permutation $\sigma$ determines
a "path" $s_0(\sigma), s_1(\sigma), \cdots, s_n(\sigma)$. Since each line segment of the hull of this
path connects two points of the path, each line segment of the hull is a sum of a
subset of the vectors $z_1, \cdots, z_n$. Moreover, this subset uniquely determines
the line segment. The next lemma tells us how often a particular segment is
likely to appear in the hull of a path. To avoid having to adopt a convention for
degenerate polygons when $n = 1$, we will assume that $n \ge 2$ from now on.

LEMMA 2. Let $z_1, \cdots, z_n$ be fixed skew vectors and let $A$ be a fixed subset of $m$
of these vectors. Then, the line segment $\bar z_A$ appears in the hull of exactly

\[ 2(m - 1)!\,(n - m)! \]

of the $n!$ paths $s_0(\sigma), \cdots, s_n(\sigma)$ as $\sigma$ ranges over all permutations.

PROOF. Let $z_{n+1} = -s_n$ and let $A'$ denote the complement of $A$ in $(z_1, \cdots,
z_n, z_{n+1})$. We call $s_0(\sigma), \cdots, s_n(\sigma), s_{n+1}(\sigma) = s_0(\sigma)$ the completed path associated
with $z_{i_1}, \cdots, z_{i_n}$. In order that $\bar z_A$ (or equivalently $\bar z_{A'}$) appears in the
hull of $s_0(\sigma), \cdots, s_n(\sigma)$, it is necessary that $z_A = s_{k+m}(\sigma) - s_k(\sigma)$ for some $k$.
We can thus think of any completed path $s_0(\sigma), \cdots, s_{n+1}(\sigma)$ whose hull contains
$\bar z_A$ as subdivided naturally into two ordered sets of vectors, $(z_{i_{k+1}}, \cdots,
z_{i_{k+m}})$ and $(z_{i_{k+m+1}}, \cdots, z_{i_n}, z_{n+1}, z_{i_1}, \cdots, z_{i_k})$. The paths corresponding to each
of these ordered sets of vectors must lie in the same half-plane of $z_A$. Moreover
any ordering of the vectors in $A$ and $A'$ subject to the condition that their
paths lie in the same half-plane of $z_A$ gives rise to a completed path $s_0(\sigma), \cdots,
s_{n+1}(\sigma)$, the origin (and hence the value of $k$) being determined from the position
of $z_{n+1}$ in the ordering of $A'$. Thus, we need only to count how many different
pairs of orderings of $A$ and $A'$ there are such that both subpaths lie in the same
half-plane of $z_A$. From Lemma 1 we find that there are $(m - 1)!\,(n - m)!$
ways of ordering $A$ and $A'$ so that the subpaths both lie in the left half-plane of
$z_A$. Taking into account also the right half-plane of $z_A$ the proof is completed.


3. Application to random walks in the plane. In the applications $Z_k = X_k +
iY_k$, where $X_k$ and $Y_k$ are real-valued random variables with a joint density
function. This implies that, with probability one, $Z_1, \dots, Z_n$ are skew vectors.
If $\sigma: i_1, \dots, i_n$ is a permutation of $1, \dots, n$, then $K_n(\sigma)$, $H_n(\sigma)$, and $L_n(\sigma)$
are defined as in (1) in terms of the sums $S_0(\sigma), \dots, S_n(\sigma)$ of the permuted
vectors $Z_{i_1}, \dots, Z_{i_n}$.

EXAMPLE 1.

Expectation of $K_n$. By the identical distribution property, $E\{K_n\} = E\{K_n(\sigma)\}$
for any permutation $\sigma$. Thus,

(2) $n!\,E\{K_n\} = E\{\sum_{\sigma} K_n(\sigma)\}$.

For any skew vector values of $Z_1, \dots, Z_n$, the summation on the right in (2)
equals the total number of times that any of the $n$ possible one-point sets $A =
\{Z_k\}$ determines a segment $Z_A$ in the hull of $S_0(\sigma), \dots, S_n(\sigma)$ as $\sigma$ ranges over
all permutations. By Lemma 2 with $m = 1$, this means

(3) $\sum_{\sigma} K_n(\sigma) = 2 \sum_{m=1}^{n} (n - 1)! = 2\,n!$.

Thus, we expect to find exactly 2 of the vectors $Z_1, \dots, Z_n$ as line segments in
the convex hull of $S_0, \dots, S_n$. We note in passing that (3) is a universal relation,
valid for any values of the skew vectors.

EXAMPLE 2.

Expectation of $H_n$. Once again we have $E\{H_n\} = E\{H_n(\sigma)\}$ for every permutation
$\sigma$. Thus,

(4) $n!\,E\{H_n\} = E\{\sum_{\sigma} H_n(\sigma)\}$.

For skew vector values of $Z_1, \dots, Z_n$ the summation on the right in (4) is
equal to the total number of lines in the $n!$ hulls of the paths $S_0(\sigma), \dots, S_n(\sigma)$
as $\sigma$ ranges over all permutations. Equivalently, from Lemma 2, summing over
all non-empty subsets $A$ and writing $m$ for the number of vectors in $A$,

(5) $\sum_{\sigma} H_n(\sigma) = \sum_{A} 2(m - 1)!\,(n - m)! = 2 \sum_{m=1}^{n} (m - 1)!\,(n - m)! \binom{n}{m} = 2\,n! \sum_{m=1}^{n} 1/m$.

Finally,

(6) $E\{H_n\} = 2 \sum_{m=1}^{n} 1/m \sim 2 \log n$.

Once again we note that (5) is a universal relation valid for any sequence of
skew vectors.

EXAMPLE 3.

Expectation of $L_n$. (Spitzer and Widom [3])$^3$. It is easy to see that

(7) $n!\,E\{L_n\} = E\{\sum_{\sigma} L_n(\sigma)\}$.

By an argument similar to that leading to (5), we find

(8) $\sum_{\sigma} L_n(\sigma) = \sum_{A} 2(m - 1)!\,(n - m)!\,|Z_A|$.

Thus,

$E\{L_n\} = \sum_{A} 2(m - 1)!\,(n - m)!\,E\{|Z_A|\}/n!$

$\qquad = \sum_{m=1}^{n} 2(m - 1)!\,(n - m)! \binom{n}{m} E\{|S_m|\}/n!$

$\qquad = 2 \sum_{m=1}^{n} E\{|S_m|\}/m$.
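The universal relations (3), (5) and (8) can be verified by brute force for a small $n$. The following Python sketch is an illustration under stated assumptions, not the authors' computation. It assumes random integer vectors are skew, reads $H_n$ as the number of sides of the convex hull of the path, $K_n$ as the number of sides that join two consecutive path points, and $L_n$ as the perimeter. All function names are hypothetical.

```python
import math
import random
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial


def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_vertex_indices(points):
    """Andrew's monotone chain; returns indices of hull vertices in order."""
    order = sorted(range(len(points)), key=lambda i: points[i])

    def chain(indices):
        out = []
        for i in indices:
            while len(out) >= 2 and cross(points[out[-2]], points[out[-1]],
                                          points[i]) <= 0:
                out.pop()
            out.append(i)
        return out

    lower = chain(order)
    upper = chain(order[::-1])
    return lower[:-1] + upper[:-1]


def path_statistics(vectors):
    """Return (K, H, L) for the path s_0, ..., s_n built from the vectors."""
    pts = [(0, 0)]
    for x, y in vectors:
        pts.append((pts[-1][0] + x, pts[-1][1] + y))
    hull = hull_vertex_indices(pts)
    edges = list(zip(hull, hull[1:] + hull[:1]))
    H = len(edges)
    # A hull side joining s_{j-1} and s_j is a single step of the walk.
    K = sum(1 for i, j in edges if abs(i - j) == 1)
    L = sum(math.hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1])
            for i, j in edges)
    return K, H, L


# Assumption: large random integer vectors are skew with near certainty.
rng = random.Random(1961)
n = 5
z = [(rng.randint(-10**6, 10**6), rng.randint(-10**6, 10**6))
     for _ in range(n)]

sum_K = sum_H = 0
sum_L = 0.0
for perm in permutations(z):
    K, H, L = path_statistics(perm)
    sum_K += K
    sum_H += H
    sum_L += L

harmonic = sum(Fraction(1, m) for m in range(1, n + 1))
rhs_H = 2 * factorial(n) * harmonic
rhs_L = sum(2 * factorial(m - 1) * factorial(n - m)
            * math.hypot(sum(v[0] for v in A), sum(v[1] for v in A))
            for m in range(1, n + 1) for A in combinations(z, m))

print(sum_K, 2 * factorial(n))   # relation (3)
print(sum_H, rhs_H)              # relation (5)
print(sum_L, rhs_L)              # relation (8), up to rounding
```

Integer coordinates keep the hull computation exact; only the perimeter comparison involves floating point.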
REFERENCES 
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Soc., Vol. 12 (1961), pp. 506-509. 


3 By a limiting argument which we could also employ in this example Spitzer and Widom
remove the condition that $Z_k = X_k + iY_k$ have a density.
A COMBINATORIAL DERIVATION OF THE DISTRIBUTION OF THE
TRUNCATED POISSON SUFFICIENT STATISTIC¹


By T. CACOULLOS


Columbia University 
Let $X_1, \dots, X_n$ be independently distributed with the Poisson distribution
truncated away from zero, i.e.,

(1) $p(x) = \dfrac{e^{-\lambda} \lambda^{x}}{(1 - e^{-\lambda})\,x!}$, $\quad x = 1, 2, \dots$.

Tate and Goen showed [2] that $T = \sum_{i=1}^{n} X_i$ has the distribution

(2) $h(t) = \Pr[T = t] = \dfrac{\lambda^{t}\, n!}{(e^{\lambda} - 1)^{n}\, t!}\, \mathfrak{S}_t^n$,

where $\mathfrak{S}_t^n$ denotes the Stirling number of the second kind defined by

(3) $\mathfrak{S}_t^n = \dfrac{1}{n!} \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} k^{t}$, $\quad t = n, n + 1, \dots$;
$\qquad\; = 0$, $\quad t < n$.

Their proof was based on characteristic functions, but a much simpler approach
is available as follows:
We have

$h(t) = \Pr\left[\sum_{i=1}^{n} X_i = t\right] = \sum_{(x_1, \dots, x_n)} \Pr[X_1 = x_1, \dots, X_n = x_n]$,

where the summation is over all ordered n-tuples $(x_1, \dots, x_n)$ of integers such
that $x_i \ge 1$ and $\sum_{i=1}^{n} x_i = t$. Hence, by (1), we get

(4) $h(t) = \sum_{(x_1, \dots, x_n)} \prod_{i=1}^{n} \dfrac{e^{-\lambda} \lambda^{x_i}}{(1 - e^{-\lambda})\, x_i!} = \dfrac{e^{-n\lambda} \lambda^{t}}{(1 - e^{-\lambda})^{n}\, t!} \sum_{(x_1, \dots, x_n)} \dfrac{t!}{\prod_{i=1}^{n} x_i!}$,

where the summation must be explained as above. We observe, however, that

$t! \Big/ \prod_{i=1}^{n} x_i!$

is the number of partitions of a population of $t$ elements into an ordered n-tuple
of subpopulations of size $x_1, \dots, x_n$, respectively. Therefore, we conclude that

(5) $\sum_{(x_1, \dots, x_n)} t! \Big/ \prod_{i=1}^{n} x_i!$

equals the number of possible ways in which $t$ (distinguishable) balls can be
placed in $n$ cells ($x_i$ being the number of balls in the $i$-th cell) so that no cell
remains empty. Hence, we find that (5) (see, for example, p. 92 of [1]) is equal to

$\sum_{k=0}^{n} (-1)^{k} \binom{n}{k} (n - k)^{t}$.

Therefore, and by virtue of (4) and (3), (2) follows.
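The counting step can be checked numerically. The sketch below is an illustration only, with hypothetical function names and an arbitrary choice of $\lambda$. It compares the closed form (2) with a direct summation over ordered n-tuples, and the number of placements with no empty cell with $n!$ times the Stirling number from (3).

```python
from itertools import product
from math import comb, exp, factorial, isclose, prod


def stirling2(t, n):
    """Stirling number of the second kind via the alternating sum in (3)."""
    if t < n:
        return 0
    total = sum((-1) ** (n - k) * comb(n, k) * k ** t for k in range(n + 1))
    return total // factorial(n)


def surjections(t, n):
    """Ways to place t distinguishable balls in n cells, no cell empty."""
    return sum(1 for cells in product(range(n), repeat=t)
               if len(set(cells)) == n)


def h_closed(t, n, lam):
    """Distribution (2) of T, the sum of n zero-truncated Poisson variables."""
    return (lam ** t * factorial(n) * stirling2(t, n)
            / ((exp(lam) - 1) ** n * factorial(t)))


def h_direct(t, n, lam):
    """Same probability by summing the product of densities (1)."""
    def p(x):
        return exp(-lam) * lam ** x / ((1 - exp(-lam)) * factorial(x))

    return sum(prod(p(x) for x in xs)
               for xs in product(range(1, t + 1), repeat=n)
               if sum(xs) == t)


lam = 1.7  # arbitrary illustrative value
print(surjections(5, 3), factorial(3) * stirling2(5, 3))
print(h_closed(5, 3, lam), h_direct(5, 3, lam))
```

The direct enumeration is exponential in $n$ and is meant only for small cases.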
REFERENCES 
[1] WILLIAM FELLER, An Introduction to Probability Theory and Its Applications, Vol. 1,
2nd ed., John Wiley and Sons, New York, 1957.


[2] R. F. TATE AND R. L. GOEN, "Minimum variance unbiased estimation for the truncated
Poisson distribution," Ann. Math. Stat., Vol. 29 (1958), pp. 755-765.







A NOTE ON SIMPLE BINOMIAL SAMPLING PLANS 
By B. BRAINERD AND T. V. NARAYANA¹


University of Toronto and University of Alberta 


Introduction. This note gives two equivalent characterizations of simple
sampling plans (s.p.'s) of size n, both of which prove the following THEOREM:
The number of simple sampling plans of size n is $\frac{1}{n}\binom{3n}{n-1}$. The definitions and
notations used will be those of M. H. DeGroot [1].

PROOF OF THE THEOREM. We indicate only the main steps in the proof,
as the details are straightforward and can be filled in by reference to [1].

1. A simple s.p. of size n is characterized by the set C of its continuation
points in the lattice quadrant.

2. A set C of lattice points in the quadrant is the set of continuation points of
a simple s.p. S of size n if and only if

(i) the intersection $C_k$ of C with each diagonal

$A_k = \{x + y = k;\ x \ge 0,\ y \ge 0\}$

is connected.

(ii) $C_k$ is non-empty if and only if $k < n$.

(iii) No point of $C_{k+1}$ is to the left of the leftmost point of $C_k$ or below the
lowest point of $C_k$. (If A, B are any two points in the lattice plane, A is to the
left of B if and only if the x coordinate of A is less than that of B and A is below
B if and only if the y coordinate of A is less than that of B).

3. Each non-empty $C_k$ is characterized by how far southeast, $t_k$, its top is
from (0, k) and how far northwest, $b_k$, its bottom is from (k, 0). $t_k$, $b_k$ are
non-negative integers.

4. The only restrictions on $\{t_k, b_k\}$ of a simple s.p. of size n are

$t_k + b_k \le k$, $\quad k = 0, 1, \dots, n - 1$,
$0 \le t_k \le t_{k+1}$, $\ 0 \le b_k \le b_{k+1}$, $\quad k = 0, 1, \dots, n - 2$.

5. The number of different solutions of the above set of inequalities is the
number of different simple s.p.'s of size n.

The combinatorial problem posed in 4, 5 may be solved thus. (A more general 
treatment of such problems is contained in [2].) 


If $(x, y)_n$ denotes the number of simple s.p.'s of size n with $t_{n-1} = x$ and
$b_{n-1} = y$, then plainly

(1) $(x, y)_n = \sum_{a=0}^{x} \sum_{b=0}^{y} (a, b)_{n-1}$ for $x + y < n$,
$\qquad\qquad\; = 0$ for $x + y \ge n$.


Received July 28, 1959; revised February 3, 1961. 


1 This note was prepared while the authors were fellows at the Summer Research Insti- 
tute of the Canadian Mathematical Congress. 







The condition $(0, 0)_1 = 1$ together with (1) determines $(x, y)_n$ recursively for
all non-negative integers x, y and positive integers n.
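The recursion (1), started from $(0, 0)_1 = 1$, is easy to tabulate. The Python sketch below is an illustration with hypothetical function names. It builds the table of $(x, y)_n$ and compares the total number of plans with the value $\frac{1}{n}\binom{3n}{n-1}$ stated in the theorem.

```python
from math import comb


def plan_table(n):
    """Return {(x, y): (x, y)_n}, the count of simple s.p.'s of size n
    with t_{n-1} = x and b_{n-1} = y, using recursion (1)."""
    table = {(0, 0): 1}                      # size 1: (0, 0)_1 = 1
    for size in range(2, n + 1):
        new_table = {}
        for x in range(size):
            for y in range(size - x):        # enforces x + y < size
                new_table[(x, y)] = sum(
                    count for (a, b), count in table.items()
                    if a <= x and b <= y)
        table = new_table
    return table


def total_plans(n):
    return sum(plan_table(n).values())


for n in range(1, 7):
    print(n, total_plans(n), comb(3 * n, n - 1) // n)
```

The double sum is recomputed for every cell, which is adequate for small n; prefix sums would be the natural refinement for larger tables.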
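The number of different simple s.p.'s with $t_{n-1} + b_{n-1} = k$ is $k_{(n)}$, where

(2) $k_{(n)} = \sum_{x+y=k} (x, y)_n = \sum_{x+y=k} \sum_{a=0}^{x} \sum_{b=0}^{y} (a, b)_{n-1} = \sum_{c=0}^{k} (k - c + 1)\, c_{(n-1)}$ for $k < n$, $\quad k_{(n)} = 0$ for $k \ge n$,

and $0_{(1)} = 1$. These conditions determine $k_{(n)}$ recursively. Experiment leads to
the conjectured solution

(3) $k_{(n)} = \dfrac{2n - 2k}{2n + k} \binom{2n + k}{k} = \binom{2n + k - 1}{k} - 2 \binom{2n + k - 1}{k - 1}$ for $k < n$,
$\qquad\quad\; = 0$ for $k \ge n$.

Recalling the simple general formula

(4) $\sum_{b=0}^{c} \binom{a + b}{b} = \binom{a + c + 1}{c}$,

it is straightforward to verify that (3) does determine $k_{(n)}$. Then (3) and (4)
together show that the total number of simple s.p.'s of size n is

(5) $T_{(n)} = \sum_{k=0}^{n-1} k_{(n)} = \binom{3n - 1}{n - 1} - 2 \binom{3n - 1}{n - 2} = \dfrac{1}{n} \binom{3n}{n - 1}$,

and the problem is solved.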
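The recursion for $k_{(n)}$ can be run directly and compared with a closed form. The sketch below is an illustration with hypothetical function names. It assumes the closed form $\binom{2n+k-1}{k} - 2\binom{2n+k-1}{k-1}$ for $k < n$, which is our reading of the conjectured solution and agrees with the recursion in every case tested here.

```python
from math import comb


def k_row(n):
    """Return [0_(n), 1_(n), ..., (n-1)_(n)] from the recursion
    k_(n) = sum_{c=0}^{k} (k - c + 1) c_(n-1), with 0_(1) = 1."""
    row = [1]
    for size in range(2, n + 1):
        prev = row + [0] * (size - len(row))   # c_(size-1) = 0 for c >= size-1
        row = [sum((k - c + 1) * prev[c] for c in range(k + 1))
               for k in range(size)]
    return row


def k_closed(n, k):
    """Assumed closed form for k_(n); zero for k >= n."""
    if k >= n:
        return 0
    first = comb(2 * n + k - 1, k)
    second = comb(2 * n + k - 1, k - 1) if k >= 1 else 0
    return first - 2 * second


for n in range(1, 6):
    row = k_row(n)
    print(n, row, sum(row), comb(3 * n, n - 1) // n)
```

Summing each row reproduces the total count of the theorem, which is the content of the last displayed relation.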


Characterization by boundary points. Let us define a symmetric s.p. of size
n as one symmetric about the line x = y. From the above, a simple symmetric
s.p. of size 2n is characterized by the vector of non-negative integers

(6) $(t_2, t_3, \dots, t_{2n-1})$ $\quad (t_1 = t_0 = 0)$,

where $t_2 \le t_3 \le \dots \le t_{2n-1}$ and $t_i \le [i/2]$, $i = 1, \dots, 2n - 1$. Consider the
vectors of non-negative integers $(a_1, \dots, a_{n-1})$ satisfying

(7) $a_1 \ge \dots \ge a_{n-1} \ge 0$, $\quad a_i \le 2n - 2i$, $\quad i = 1, \dots, n - 1$.

From a vector $(a_1, \dots, a_{n-1})$ satisfying (7) we obtain a vector $(t_2, \dots, t_{2n-1})$
satisfying (6) (and conversely) by the following 1:1 correspondence:

(8) Given $(a_1, \dots, a_{n-1})$ construct a vector $(t_2, \dots, t_{2n-1})$ in which

The first $(2n - 2 - a_1)$ t's are zero.
The next $(a_1 - a_2)$ t's are one.
. . .
The next $(a_{n-2} - a_{n-1})$ t's are $n - 2$.
The last $a_{n-1}$ t's are $n - 1$.

Thus the simple symmetric s.p.'s of size 2n can also be characterized by the
vectors $(a_1, \dots, a_{n-1})$ satisfying (7); or more precisely the vectors $(a_1, \dots,
a_{n-1}, 0, 0, 0, a_{n-1}, \dots, a_1)$ satisfying (7) characterize simple symmetric s.p.'s
of size 2n. The a's in fact represent the "distances" of its boundary points from
the points on the line x + y = 2n. From known results [3, p. 170], the number
of simple symmetric s.p.'s of size 2n is $\frac{1}{n}\binom{3n}{n-1}$.

Evidently, a 1:1 correspondence similar to (8) yields a characterization of
any simple s.p. of size n in terms of the "distances" of its boundary points from
the line x + y = n. The vectors $(a_1, \dots, a_{n+1})$ depend on both $(t_1, \dots, t_{n-1})$
and $(b_1, \dots, b_{n-1})$ in this case, but the method as well as the conditions satisfied
by $(a_1, \dots, a_{n+1})$ can be easily derived. Since the lattice-theoretic ideas
developed in [2, 3] yield a simple 1:1 correspondence between the vectorial
representations (using boundary points) of simple s.p.'s of size n and simple
symmetric s.p.'s of size 2n, we obtain without further calculations another
proof of our theorem. The characterization (7) of s.p.'s and their interpretation
as a distributive lattice applies with little change to other problems in probability
theory, and yields a unified approach for rederiving and extending many
results [cf. 2].
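The boundary-point characterization can also be explored by enumeration. The sketch below is an illustration with hypothetical function names. It assumes the reading $a_1 \ge \dots \ge a_{n-1} \ge 0$, $a_i \le 2n - 2i$ for the a-vectors and $t_2 \le \dots \le t_{2n-1}$, $t_i \le [i/2]$ for the t-vectors. It lists the a-vectors, maps each to a t-vector by the block construction, and checks that the count equals $\frac{1}{n}\binom{3n}{n-1}$.

```python
from itertools import product
from math import comb


def a_vectors(n):
    """All (a_1, ..., a_{n-1}) with a_1 >= ... >= a_{n-1} >= 0, a_i <= 2n - 2i."""
    ranges = [range(2 * n - 2 * i + 1) for i in range(1, n)]
    return [a for a in product(*ranges)
            if all(a[i] >= a[i + 1] for i in range(len(a) - 1))]


def a_to_t(a, n):
    """Block construction: (2n-2-a_1) zeros, (a_1-a_2) ones, ..., a_{n-1}
    copies of n-1. Returns (t_2, ..., t_{2n-1})."""
    ext = (2 * n - 2,) + tuple(a) + (0,)
    t = []
    for value in range(n):
        t += [value] * (ext[value] - ext[value + 1])
    return tuple(t)


def is_valid_t(t):
    """t is non-decreasing and t_i <= [i/2]; t[0] stands for t_2."""
    monotone = all(t[j] <= t[j + 1] for j in range(len(t) - 1))
    bounded = all(t[j] <= (j + 2) // 2 for j in range(len(t)))
    return monotone and bounded


for n in range(1, 6):
    vectors = a_vectors(n)
    images = {a_to_t(a, n) for a in vectors}
    print(n, len(vectors), comb(3 * n, n - 1) // n,
          len(images) == len(vectors), all(is_valid_t(t) for t in images))
```

Each $a_i$ is the number of t's that are at least $i$, which is why the bound $a_i \le 2n - 2i$ matches the bound $t_i \le [i/2]$.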


Acknowledgment. The authors are grateful to the referee for his helpful sug- 
gestions. 
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Math. Bull., Vol. 1 (1958), pp. 169-173. 




AN INEQUALITY FOR BALANCED INCOMPLETE BLOCK 
DESIGNS 


By V. N. Murty 
Central Statistical Organization, New Delhi 


1. Summary. For a resolvable balanced incomplete block design, R. C. Bose
[1] obtained the inequality $b \ge v + r - 1$, and P. M. Roy [2] and W. F. Mikhail
[3] proved this inequality without the assumption of resolvability, but with the
weaker assumption that v is a multiple of k. In this note an alternative and
simpler proof of Roy's theorem is given.

2. Proof. A B.I.B. design is an arrangement of v treatments in b blocks of
size k < v such that (i) every block contains k distinct treatments, (ii) every
treatment occurs in r blocks and (iii) any two treatments occur together in
$\lambda$ blocks.


The parameters satisfy
(2.1) $bk = vr$,
(2.2) $\lambda(v - 1) = r(k - 1)$,
(2.3) 
From (2.2) we have 


(2.4) $r/(v - 1) = \lambda/(k - 1) = (r - \lambda)/(v - k)$.


If now we assume that v is a multiple of k, v = nk, we have from (2.4)

$r/(v - 1) = (r - \lambda)/(v - k) = (r - \lambda)/(k(n - 1))$,

(2.5) $(r(n - 1))/(v - 1) = (r - \lambda)/k$.

Putting v = nk in (2.1), we have b = nr, so that (2.5) can be rewritten as

(2.6) $(r - \lambda)/k = (b - r)/(v - 1)$.

Rewriting (2.2) after expansion we have $r - \lambda = rk - v\lambda$, and $(r - \lambda)/k =
r - n\lambda$. Thus

(2.7) $(r - \lambda)/k = (b - r)/(v - 1) = r - n\lambda$.

Since n, r, $\lambda$ are all integers, $r - n\lambda$ is an integer, from which it follows that
the other two ratios in (2.7) are integers. It can easily be seen that they must
be positive integers since $r > \lambda$ and k is a positive integer. Therefore

$(b - r)/(v - 1) \ge 1$

and $b \ge v + r - 1$.
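The chain of equalities (2.7) is easy to check on concrete parameter sets. The sketch below is an illustration only. The function name is hypothetical, and the listed parameter sets are chosen by us because they satisfy $bk = vr$ and $\lambda(v - 1) = r(k - 1)$ with v a multiple of k; the code checks the arithmetic and does not construct the designs.

```python
from fractions import Fraction


def murty_ratios(v, b, r, k, lam):
    """Return the three quantities of (2.7) for parameters with v = n*k."""
    if b * k != v * r or lam * (v - 1) != r * (k - 1):
        raise ValueError("parameters violate bk = vr or lam(v-1) = r(k-1)")
    if v % k != 0:
        raise ValueError("v must be a multiple of k")
    n = v // k
    return (Fraction(r - lam, k),
            Fraction(b - r, v - 1),
            Fraction(r - n * lam))


# (v, b, r, k, lambda): illustrative parameter sets, assumed for this sketch.
parameter_sets = [
    (4, 6, 3, 2, 1),
    (6, 10, 5, 3, 2),
    (9, 12, 4, 3, 1),
    (6, 15, 5, 2, 1),
    (8, 14, 7, 4, 3),
]

for v, b, r, k, lam in parameter_sets:
    first, second, third = murty_ratios(v, b, r, k, lam)
    print((v, b, r, k, lam), first, second, third, b >= v + r - 1)
```

In each case the three ratios coincide and are positive integers, so the inequality follows as in the text.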


REFERENCES 
[3] WADIE F. MIKHAIL, "An inequality for balanced incomplete block designs," Ann.
Math. Stat., Vol. 31 (1960), pp. 520-522.
THIRD ORDER ROTATABLE DESIGNS IN THREE DIMENSIONS: SOME
SPECIFIC DESIGNS¹


By NORMAN R. DRAPER²
Mathematics Research Center, Madison, Wisconsin


0. Summary. A recent paper [3] described a method for constructing infinite 
classes of third order rotatable designs in three dimensions. This note gives nine 


specific designs selected from some of the infinite classes that have been shown to 
exist. 


1. Preliminary remarks. Two papers by Bose and Draper [1] and Draper [2] 
dealt with the formation of second order rotatable designs in three [1] and in 
four or more [2] dimensions. It was shown there that certain point sets could be 
combined in such a way that infinite classes of designs, which included as special 
cases all previously known designs, could be formed. In further work [3] it has 
been established that certain pairs of the second order design classes found in 
[1] could be combined in such a way that infinite classes of third order rotatable 
designs for three factors were formed. Gardiner, Grandage and Hader [5] found 
four particular third order designs for three factors and it was shown [3] that 
two of their designs were the extreme cases of an infinite class. Several other 
infinite classes of third order rotatable designs have been tabulated in unpub- 
lished work by the author. (All are combinations of the infinite classes of second 
order rotatable designs given in [3].) This note presents nine specific designs, 
chosen from these infinite classes. The calculations by which they were obtained 
are illustrated elsewhere [3] and have not been repeated here. 

The notation of [3] has been used below without further explanation; in particular,
the symbol $D_i$ refers to a second order rotatable design class to be found in
Table II, on page 871 of [3]. The third order rotatable designs listed below consist
of two separate second order rotatable designs, one each from two of the
design classes in the table mentioned. The values of $N$, $\lambda_2 N$, $\lambda_4 N$, $\lambda_6 N$, $\lambda_4/\lambda_2^2$ and
$\lambda_6 \lambda_2/\lambda_4^2$ for each complete third order design are provided. Since it will be desirable
to provide extra center points in some designs, $N$ has been given in the form
(number + $n_0$).

It will be noticed that the designs are given in terms of a parameter $\theta$; thus
they are not scaled in the usual sense (i.e., $\lambda_2 \ne 1$). If desired, however, $\theta$ can be
chosen so that $\lambda_2 = 1$ in order to make use of formulae given in [5] for the inversion
of certain matrices needed in the analysis of third order rotatable designs.
Formulae suitable when $\lambda_2 \ne 1$ are given in [4]; in this case, it is most convenient
to set $\theta = 1$.
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2. Nine third order rotatable designs. 
{D, with a = @,c, = @ = 2'0 
\D, with f = 2'0,a = 0,c¢ = 20 

MN = 486° 

uN = 326° 

MV = 160° 


(1) 


N = 46 ob No 
h/A3 = N/72 
Nedo/AG = 2 


(De with Qa=—- a= 6, c = 20 

\Ds with Q=-a== f = 29 
AN = 486° 
UN = 326 
MN = 160° 


(2) 


N = 46 ae No 
e/A3 = N/72 
Mede/AG = 2 


(20 points) 
(26 points) 


(22 points) 
(24 points) 


(Note: Designs (1) and (2) are identical if the overall third order design is con- 
sidered. The split into two second order designs is, however, different in the two 


cases. ) 


- { Dz with a, = 0.55780, a2 
°/ \D, with f = 0.72010, a = 


AN = 11. j N = 48 + mm 
rN = 2: d4/A3 = 0.0158 N 
AN = 0.2785 Aede/AZ = 0.74102 


{De with a; = 0.58780, a2 = 0.27390, c = 0 
\De with p = 0, g = 0.66090, c = 0.93476 
AN = 14.09996 
uN = 2.52636 
NV = 0.33330° 


N = 52 + No 
d4/AZ = 0.0127 N 
Nedo/Ai = 0.73643 


SDs with f =C( —- a= 6 
\ Dy with f = 0.93570, a = 0.86460, c = 1.56536 


AN = 29.88500 
WN = 11.53686 
AN = 3.34210° 


N = 50 + No 
hs/AZ = 0.0129 N 
Aede/AZ = 0.75043 


s 


iDs with f =SC = a = 6 
| Ds with p = 1.40786, ¢ = 0.51120, a = 0.78320 


\N = 36.94470 
XN = 15.84356* 
NV = 5.09490° 


N = 56 + No 
d4/AZ = 0.0116 N 
Aede/AZ = 0.74986 


7) D; with f = q = @ = 0 
(4 \Ds with p = 1.52050, q = a = 0.59800 


(22 points) 
(26 points) 


(22 points) 
(30 points) 


(24 points) 
(26 points) 


(24 points) 
(32 points) 


(24 points) 
(32 points) 
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TABLE 1 


Values of the parameters for the component second order rotatable designs 


Third Order 


este Components A2N 


D; 


1.0756 6 
64 
1.5263 64 


17.8850 6? 
12 @ 
24.9447 6? 


12 @ 


5368 64 
4 64 
8436 64 


464 


da/AZN 


0.0294 
0.0341 


0.0348 
0.0200 


] 
36 


0.0236 


1 
36 
0.0190 


] 
86 


0.0208 
10.8284 # 4 6 0.0341 
26.2681 6? .9930 64 0.0217 


27 .0797 @? 5.2772 64 


4 6 a's 24 + 
5.9998 6 0.0202 30 + 
* Singular, center points essential; center points are also desirable in some of the other
designs in which $\lambda_4/\lambda_2^2$ is not much greater than the singular value of 0.6, when $n_0 = 0$.


AN = 39.07970° N 
\N = 19.27726 
NV = 7.46410° 


= 56+ np 
h4/AS = 0.0126 N 
NeAo/As = 0.78493 


f 


(8 (met 0, Cy = 2'6,c=0 
©) Ds with p = 1.51670, q = 0.60378, a = 0.50420 


NN = 37.09650° N = 56+ 
MN = 18.99306 \4/A3 = 0.0138 N 
Nedo/AG = 0.76756 


AN = 7.46390° 


(24 points) 
(32 points ) 


( 


™ with f= aq = @ = 0 
\ De with p = 0.98480, g = 0.57488, c 


(24 points) 


(9 : 
’ (30 points) 


1.44530 
AN = 29.22300 
uN = 9.99996" 
AsV = 2.54050° 


Nedo/AZ = 0.74244 
3. Warning. The $\lambda$-values quoted above are those that apply for the complete
third order rotatable design. If one second order part of the whole design is run
first and a second order regression is performed, the $\lambda$-values appropriate to it
must be used in the second order X'X matrix. These are provided in Table 1.
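The two ratios quoted with each design follow from the three listed moments, and the scale parameter cancels in both. The Python sketch below is an illustration only. It assumes that the three moment lines of each design give $\lambda_2 N$, $\lambda_4 N$ and $\lambda_6 N$ as multiples of the second, fourth and sixth powers of the scale parameter, and it uses the three designs whose values are fully legible above (those with 52, 50 and 56 points before center points are added). The function name is hypothetical.

```python
def moment_ratios(lam2_N, lam4_N, lam6_N):
    """Return (c, q) where lambda_4/lambda_2^2 = c * N and
    q = lambda_6 * lambda_2 / lambda_4^2. The scale parameter cancels."""
    c = lam4_N / lam2_N ** 2
    q = lam6_N * lam2_N / lam4_N ** 2
    return c, q


# points: (lambda_2 N, lambda_4 N, lambda_6 N, quoted c, quoted q),
# coefficients read from the design listings above.
designs = {
    52: (14.0999, 2.5263, 0.3333, 0.0127, 0.73643),
    50: (29.8850, 11.5368, 3.3421, 0.0129, 0.75043),
    56: (36.9447, 15.8435, 5.0949, 0.0116, 0.74986),
}

for points, (l2, l4, l6, quoted_c, quoted_q) in designs.items():
    c, q = moment_ratios(l2, l4, l6)
    print(points, round(c, 4), quoted_c, round(q, 5), quoted_q)
```

Small differences in the last digits are expected because the tabulated moments are themselves rounded to four decimals.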
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ABSTRACTS OF PAPERS 


(Abstracts of papers presented at the Annual Meeting of the Institute, Seattle, Washington, 
June 14-17, 1961. Additional abstracts appeared in the June, 1961 issue.) 


18. Tables for the Reliability of Repairable Systems with Time Constraints
(Preliminary report). Roquez Besarano and Ronald S. Dick, International
Electric Corp., Paramus, N. J.


Tables have been prepared to solve for the reliability of systems composed of A similar
subsystems of which at most N can be inoperable for periods exceeding $t_0$ time units. A
second time constraint is introduced into the model so that for at least time $t_1$ following a
return of the system from state N + 1 to N machines inoperative, the system is only in
states 0 to N or the system fails.

The mixed difference-differential equations solved are of the forms: 


Pi(t) = —[p; + AMPs (t) + Aj-aP 5-1) + jasPjai(t) + wwasPancy th)iPwast — &)) 


P;(t) —[ey + AMPs (t) + Aj-1P yan (t) + wjarPjir(t) — Aw[Pw(t — to) Pavan i (lo). 


PT' (t) — py iP wy i) (t1) [Pwaice—gy] — (ag + AG) PSE) + wjarPRia(t) + Ay -1PRA(t) 


where appropriate boundary conditions are applied. Reliability is defined as R(t) = 
dofAo Py (t) + >%o Ph (t). For A = 1 (1) 5, and N = 0(1)A — 1, the tables give for 81 
combinations of $\lambda$ and $\mu$ the approximate time at which R(t) = .001, .005, .01, .05, .10 as
well as the MTBF. The Cornish-Fisher equation and Weibull approximations are used in
finding the reliability points. The MTBF is found by evaluating the Laplace transforms
of the mixed difference-differential equations and is exact. Reference should be made to
"The Reliability of Repairable Complex Systems, Part A: The Similar Machine Case" by
R. S. Dick, 5th Mil-E-Con National Convention on Military Electronics, 1961, for a complete
set of equations solved in this paper and the details of the model.


19. Mutual Information and Maximal Correlation as Measures of Dependence.
C. B. Bell, San Diego State College.


Kramer (1961) asks if Shannon's mutual information, Cp , is equivalent to Kramer's
generalization (to arbitrary $\sigma$-algebras) of Gebelein's (1939) Maximal Korrelation, Sp,
which satisfies Rényi's (1959) postulates for a dependence measure of pairs of random
variables. It is found that for two normalizations Cp and C> of Cp: (1)0 S Sp, Cp : 
Cp 3 1; (2) Sp = Oiff Cp = 0 iff Cp = 0 iff the algebras are independent. For strictly posi- 
tive probability spaces, (3) the algebras are set independent iff there exists a probability 
function P; such that Sp, = Cp, = Cp, = 0; (4) Ce = 1 iff one algebra contains the other; 
(5) C> = 1 iff the algebras are equal; (6) Sp = 1 if the algebras have a non-trivial inter- 
section; (in the finite case, the converse of (6) holds;) (7) there exists a probability space 
such that no two of the dependence measures are equivalent. Open Problems: Which of 
(3)-(6) are valid for (a) the Gelfand-Yaglom (1957) mutual information for non-atomic 
algebras generated by random variables; and (b) the Lloyd mutual information for arbi- 
trary algebras? 
20. On a Necessary and Sufficient Condition for a Set of Jointly Normal Variables
to have a Common Variance and a Common Covariance (Preliminary
report). B. R. Bhat, University of California, Berkeley.


The following theorem is proved. Let $x_i$ ($i = 1, 2, \dots, n$) have a joint n-variate normal
distribution with mean 0. Then the necessary and sufficient condition that $n\bar{x}^2 = (\sum x_i)^2/n$
and $\sum (x_i - \bar{x})^2$ are distributed independently and the latter as $c\chi^2$, where c is a constant,
is that Var $x_i = \sigma^2$ and Cov $(x_i, x_{i'}) = \nu$ ($i, i' = 1, 2, \dots, n$). The sufficiency of this theorem
is well known. The necessity follows from the facts that if X is $N(0, \Sigma)$ (i) $X'AX$ and $X'BX$
are distributed independently if and only if $A\Sigma B = 0$ and (ii) $X'AX$ has a $c\chi^2$ distribution
if and only if $cA = A\Sigma A$ (cf. C. R. Rao, Advanced Statistical Methods in Biometric Research,
p. 56). It is also proved that, if $x_i$, $y_j$ ($i, j = 1, 2, \dots, n$) have a joint 2n-variate
normal distribution, then $Q = \sum (x_i - \bar{x})^2 + \sum (y_j - \bar{y})^2$ is distributed as $c\chi^2$ independently
of $\bar{x}^2$ and $\bar{y}^2$ if and only if Var $x_i = \sigma_1^2$, Var $y_j = \sigma_2^2$, Cov $(x_i, x_{i'}) = \nu_1$,
Cov $(y_j, y_{j'}) = \nu_2$, Cov $(x_i, y_j) = \nu_3$ ($i, i', j, j' = 1, 2, \dots, n$). In particular,
$(\bar{x} - \bar{y})[n(n - 1)/Q]^{1/2}$ has a t-distribution with $2n - 2$ d.f., if further $\nu_3 = \frac{1}{2}(\nu_1 + \nu_2)$.


21. A Property of Least Squares Estimator in Regression Analysis when the
Independent Variables are Stochastic. P. K. Bhattacharya, University
of North Carolina. (Introduced by S. N. Roy.)


$(X_1, \dots, X_p, Y)$ follows a (p + 1)-variate distribution which is assumed to satisfy
the following conditions: (i) for every non-null $(a_0, a_1, \dots, a_p)$, the set

$\{(x_1, \dots, x_p, y) : a_0 + a_1 x_1 + \dots + a_p x_p = 0\}$

has probability zero, (ii) $E(X_j X_{j'})$ is finite, $j, j' = 0, 1, \dots, p$, $X_0 = 1$, (iii)
$E[Y \mid X_1, \dots, X_p]$ is a linear function of $X_1, \dots, X_p$, (iv) $V[Y \mid X_1, \dots, X_p]$ is a
finite constant. $n \ge p + 1$ independent observations are made on $(X_1, \dots, X_p, Y)$ and
the loss in estimating the true regression function $\phi(x_1, \dots, x_p) = E[Y \mid x_1, \dots, x_p]$ by
another function $\psi(x_1, \dots, x_p)$ is $W(\phi, \psi) = \int (\phi - \psi)^2\, dF$ where $F(x_1, \dots, x_p)$ is the
marginal distribution function of $X_1, \dots, X_p$. Let C be the class of all estimators which
are linear in the Y's and have bounded risk. Then the estimator obtained by the method of
least squares belongs to C and has uniformly minimum risk in C if and only if all the
elements of the inverse of the matrix of normal equations have finite expectations. This
last condition is not satisfied in general, and in particular, for p = 1 and for a normal
distribution of $X_1$, it is satisfied if and only if $n \ge 4$.


22. Selecting the "Best" t out of k Populations. P. K. Bhattacharya, University
of North Carolina. (By title) (Introduced by S. N. Roy.)


$F(x, \theta)$ is a family of continuous distribution functions admitting density functions
$f(x, \theta)$ and $g(\theta)$ is a real valued function satisfying the following conditions: (i) for $\theta_2 > \theta_1$,
$f(x, \theta_2)/f(x, \theta_1)$ is a monotonically increasing function of x, (ii) $g(\theta)$ is a monotonically
increasing function of $\theta$. Suppose $X_1, \dots, X_k$ have distribution functions

$F_1(x) = F(x, \theta_1), \dots, F_k(x) = F(x, \theta_k)$

respectively, each of which belongs to the above family and one observation is made on
each of $X_1, \dots, X_k$, the observations being independent. Let $C(\theta_1, \dots, \theta_k)$ be the
sum of the largest t of the quantities $g(\theta_1), \dots, g(\theta_k)$. A vector $(d_1, \dots, d_k)$, $d_j = 0$ or 1,
$\sum_j d_j = t$, represents the decision for selecting the random variable $X_j$ if and only if
$d_j = 1$, and the loss in taking the decision $(d_1, \dots, d_k)$ when $(\theta_1, \dots, \theta_k)$ obtains, is
$C(\theta_1, \dots, \theta_k) - \sum_j d_j\, g(\theta_j)$. It has been shown that the decision function $\delta^*$ defined below
is admissible and minimax: $\delta^*(x) = (\delta_1^*(x), \dots, \delta_k^*(x))$, where $\delta_j^*(x) = 1$ if $x_j$ is one of
the largest t of $x_1, \dots, x_k$, and $= 0$ otherwise.
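The decision rule and its loss can be written out in a few lines. The sketch below is an illustration only. It assumes no ties among the observations (which holds with probability one for continuous distributions), takes the loss to be the sum of the t largest values of g minus the sum of g over the selected populations, and uses the identity for g by default. The function names and the numbers are hypothetical.

```python
from math import isclose


def select_largest(observations, t):
    """Return the 0/1 decision vector that selects the t largest observations."""
    order = sorted(range(len(observations)),
                   key=lambda j: observations[j], reverse=True)
    chosen = set(order[:t])
    return [1 if j in chosen else 0 for j in range(len(observations))]


def selection_loss(decision, thetas, g=lambda theta: theta):
    """Sum of the t largest g(theta) minus the sum of g over the selection."""
    t = sum(decision)
    best = sum(sorted((g(th) for th in thetas), reverse=True)[:t])
    return best - sum(d * g(th) for d, th in zip(decision, thetas))


observations = [0.3, 2.1, -0.4, 1.7]   # one observation per population
thetas = [0.0, 2.0, 1.9, 1.0]          # hypothetical true parameters
decision = select_largest(observations, t=2)
print(decision, selection_loss(decision, thetas))
```

The loss is zero exactly when the selected populations are t populations with the largest values of g.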


23. Approximations for the Entropy of Functions of Markov Chains. John J.
Birch, University of Nebraska. (By title)


If $\{Y_n\}$ is a stationary ergodic Markov process taking on values in a finite set $\{1, 2, \dots, A\}$,
then its entropy can be calculated directly. If $\phi$ is a function defined on $1, 2, \dots, A$, with
values $1, 2, \dots, D$, no comparable formula is available for the entropy of the process
$\{X_n = \phi(Y_n)\}$. However, the entropy of this functional process can be approximated
by the monotonic functions

$\bar{G}_n = h(X_n \mid X_{n-1}, \dots, X_1)$ and $\underline{G}_n = h(X_n \mid X_{n-1}, \dots, X_1, Y_0)$,

the conditional entropies. Furthermore, if the underlying Markov process $\{Y_n\}$ has strictly
positive transition probabilities, these two approximations converge exponentially to the
entropy H, where the convergence is given by $0 \le \bar{G}_n - H \le B\rho^{n-1}$ and $0 \le H - \underline{G}_n \le
B\rho^{n}$ with $0 < \rho < 1$, $\rho$ being independent of the function $\phi$.


24. Some Properties of a Large Set of Random Signals (Preliminary report).
Nelson M. Blachman, Sylvania Electronic Defense Labs. (Introduced
by Emanuel Parzen.)


For a communication channel that accepts n-tuples of real numbers of mean-square
value P as input signals and delivers them with each component perturbed by the addition
of independent, zero-mean normal noise of variance N, M different signals can be distinguished
with an error probability approaching zero as $n \to \infty$ provided $\tan^2 \theta > N/P$,
where $\sin \theta = M^{-1/n}$. To achieve this result, it suffices to choose for the signals the rectangular
coordinates of independent random points $s_1, \dots, s_M$ on the surface of an n-sphere
of radius $(nP)^{1/2}$ centered at the origin O. When a perturbed n-tuple r is received, the most
likely signal is that corresponding to the nearest $s_i$. Thus, the space of all r is divided into
M convex, pyramidal regions $R_1, \dots, R_M$, with $R_i$ consisting of all points closer to $s_i$
than to any other $s_j$. We find, e.g., that nearly every $s_j$ within an angular distance $2\theta$ of
$s_i$ contributes a face to $R_i$. In a random direction from $s_i$, with probability approaching 1,
$R_i$ extends out very nearly just to the circular cone of generating angle $\theta$ with axis $Os_i$.
This cone very closely circumscribes $R_i$ and nearly every one of its faces, edges, etc. Nearly
all of R,’s surface is accounted for by faces approximately arc sin (2 sin @) from s; . The 
nearest edge of R; of dimensionality n — k is approximately ¢; from s; , with sin* ¢, cos ¢& = 
ki (k + 1)-i*) sin’ 6. From such results, we obtain lower bounds on the noise variance 
that could result in a large error probability if a portion of the noise should be dependent
on the signal being transmitted and on $s_1, \dots, s_M$.


25. On a Problem in Hilbert Space with Applications. J. R. Blum and D. L.
Hanson, Sandia Corporation.


Let $\{X_n, n = 0, \pm 1, \dots\}$ be a stationary stochastic process. Then it is known that a
necessary and sufficient condition that the process be purely nondeterministic is that the
spectral distribution of the process be absolutely continuous and that the logarithm of
the spectral density be integrable. In this paper we obtain necessary and sufficient conditions
directly on the covariance sequence. Several related problems are discussed.
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26. Length of the Longest Run of Consecutive Successes. E. J. Burr, Uni- 
versity of New England, Armidale, N.S.W., Australia. (Introduced by 
D. B. DeLury.) 


In an ordered sequence of n observations, let those possessing a specified attribute be 
called ‘‘successes’’, and those not possessing it “‘failures’’. A conspicuous feature of such 
a sequence is the length k of the longest run of consecutive successes observed. On the 
hypothesis that the successes and failures occur in random order, all permutations being 
equally probable, we derive formulae for the probability that the statistic k should exceed 
any specified value (i) when the probability of success in each trial is given, (ii) when the 
numbers of successes and failures are given, (iii) when the sequence is circular with no 
preferred initial point. The joint distribution of the lengths of the longest success run and 
longest failure run is also derived. The treatment is greatly simplified by introducing the 
concept of a success run of length zero. 


27. Comparing Distances between Multivariate Normal Populations, I (Preliminary
report). Theophilos Cacoullos, Columbia University. (By
title)


Let $\pi_i$ be p-variate normal populations with means $\mu^{(i)}$, $i = 0, 1, \dots, k$, respectively,
and with the same known covariance matrix $\Sigma$. The $\mu^{(i)}$, $i = 1, \dots, k$, are known and $\mu^{(0)}$
is unknown. Let $\Delta_{ij} = (\mu^{(i)} - \mu^{(j)})' \Sigma^{-1} (\mu^{(i)} - \mu^{(j)})$ denote the generalised (Mahalanobis)
distance between $\pi_i$ and $\pi_j$. On the basis of a sample $x_1, \dots, x_n$ from $\pi_0$ a population
$\pi_i$, $i = 1, \dots, k$, is to be selected so that $\Delta_{i0} = \min_{1 \le j \le k} \Delta_{j0}$. Let $d_i$ be the decision of
selecting $\pi_i$. (1) Assume that the $\mu^{(i)}$, $i = 1, \dots, k$, are collinear. Then by invariance
under linear transformations on the p-space the problem reduces to locating the mean of 
a normal variable with unit variance into one of k consecutive intervals covering the real 
line. Hence the theory of monotone procedures for the exponential class of distributions 
(Girshick and Blackwell, Theory of Games and Statistical Decisions, pp. 179-193, John 
Wiley and Sons, New York, 1954) applies. Let # be the sample mean and 4;;(%) = 
(2% — p® — p)/2- (wD — wp), i, 7 = 1, --- , k. Then, e.g., for k = 2 the family of de- 
cision rules: take d, if 6:2.(#) < c, take d; otherwise, —«» < c < +, is minimal complete 
for a wide class of loss functions. (2) Suppose that the k points u™, --- , uw are vertices 
of a (kK — 1)-simplex in p-space (p = k — 1). Define 6(Z) = (6:2(%), --- , 5:%(#))’ and simi- 
larly 6(u). Then 6(#) has a (k — 1)-variate normal distribution with mean 5(u) and 
known covariance matrix A, say. If xi:(a@) denotes the 100a percentage point of a x?- 
distribution with k — 1 degrees of freedom, then of all level a tests of the hypothesis 
Aj, = Age = +++ = Ae with power depending only on nd’ (u )A—"5 (u ) the test with critical 
region nd’ (Z)A—5(Z) > xk-1(a) is uniformly most powerful. If 6;;(n) s —nrAi; , for all 
j = %i,0 < S 1,is the region where d; is the correct decision,i = 1, --- , k, then a unique 
minimax solution is found for constant loss functions. 


28. Subsamples and Order Statistics (Preliminary report). J. T. Chu and 
Kamal Ya'Coub, University of Pennsylvania. 


Suppose that a random sample of size mn is drawn from a given distribution and the 
sample is divided into m subsamples each of size n. The observations in each subsample 
may be arranged in order of magnitude. In this way, we obtain m order statistics each of 
size n. For subsequent analysis, a subset of observations may be selected, as representa- 
tives, from each of the m order statistics. Furthermore, by combining the jth order statistic 
of each subsample, one obtains a random sample of size m from the population of the jth 
order statistics in samples of size n drawn from the parent distribution. Various types of 
statistical inference based on such divisions, orderings, and selections of a random sample 
are being investigated. A number of devices have been found where savings in time and 
computation compare favorably against loss of accuracy. Methods which improve the ef- 
ficiencies of existing ones are also found. 
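
The subdivision described in this abstract can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the standard normal parent distribution, and the choice m = 5, n = 4 are hypothetical and are not taken from the abstract, which does not specify the inference procedures themselves.

```python
import random


def subsample_order_statistics(sample, m, n):
    """Split a sample of size m*n into m subsamples of size n and sort each one."""
    if len(sample) != m * n:
        raise ValueError("sample must contain exactly m * n observations")
    return [sorted(sample[i * n:(i + 1) * n]) for i in range(m)]


def jth_order_statistics(ordered_subsamples, j):
    """Collect the j-th order statistic (1-based) from every ordered subsample."""
    return [sub[j - 1] for sub in ordered_subsamples]


# Hypothetical illustration: m = 5 subsamples of size n = 4 from N(0, 1).
rng = random.Random(0)
m, n = 5, 4
sample = [rng.gauss(0.0, 1.0) for _ in range(m * n)]
ordered = subsample_order_statistics(sample, m, n)

# A sample of size m from the population of 2nd order statistics
# in samples of size n.
second_order_stats = jth_order_statistics(ordered, 2)
```

Each list returned by `jth_order_statistics` is a random sample of size m from the distribution of the jth order statistic in samples of size n, which is the object the abstract proposes to analyse.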


29. Percentile Estimators for the Parameters of the Exponential Failure Law. 
Satya D. Dubey, Procter and Gamble Co. (By title) 


For the 2-parameter exponential failure law, the percentile estimators of the location 
and the scale parameters, based on at most two percentiles, have been derived under three 
different possible cases. The sampling and the asymptotic distributions and the expressions 
of the kth moments of these percentile estimators have been obtained. The choices for the 
cumulative probabilities have been made in such a manner that the corresponding per- 
centiles insure asymptotic minimum variance unbiased percentile estimators of the loca- 
tion and the scale parameters. In case both the location and the scale parameters are un- 
known, the concept of the generalized variance, which is defined as the determinant of the 
variance-covariance matrix, has been used to determine two cumulative probabilities en- 
suring minimum generalized variance. The smallest sample observation and the 80th per- 
centile seem to provide asymptotically most efficient percentile estimators for both the 
parameters of the exponential distribution. 
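
A percentile estimator of this kind can be sketched in Python as follows. This is a generic method-of-percentiles sketch, not necessarily the estimator derived in the paper: it assumes the distribution function $F(x) = 1 - \exp\{-(x - \mu)/\sigma\}$ for $x \ge \mu$, whose $p$th percentile is $\mu - \sigma\ln(1 - p)$, and it solves the two percentile equations for $\mu$ and $\sigma$. The function names, the rank convention for sample percentiles, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def sample_percentile(sorted_x, p):
    """Order statistic of rank ceil(N * p); one simple sample-percentile convention."""
    rank = max(1, math.ceil(len(sorted_x) * p))
    return sorted_x[rank - 1]


def percentile_estimates(x, p1, p2):
    """Location and scale estimates from two sample percentiles, 0 < p1 < p2 < 1.

    Solves x_p1 = mu - sigma*log(1 - p1) and x_p2 = mu - sigma*log(1 - p2).
    """
    if not 0.0 < p1 < p2 < 1.0:
        raise ValueError("need 0 < p1 < p2 < 1")
    s = sorted(x)
    x1, x2 = sample_percentile(s, p1), sample_percentile(s, p2)
    scale = (x2 - x1) / (math.log(1.0 - p1) - math.log(1.0 - p2))
    location = x1 + scale * math.log(1.0 - p1)
    return location, scale


# Hypothetical data: location 3, scale 2 (expovariate takes the rate 1/scale).
rng = random.Random(1)
data = [3.0 + rng.expovariate(1.0 / 2.0) for _ in range(20000)]
location_hat, scale_hat = percentile_estimates(data, 0.10, 0.80)
```

The abstract's recommended pairing of the smallest observation with the 80th percentile corresponds to letting the lower percentile shrink toward the sample minimum.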


30. On Separating a Deterministic Component from a Stochastic Sequence. 
Friedhelm Eicker, University of North Carolina. 


In the separation of a deterministic component of the form of a linear regression $Y\theta$ 
from a stochastic sequence $y_1, y_2, \ldots$ the attention has been focussed almost exclusively 
on the estimation of $\theta$. It can easily be seen, however, that often some methods applied for 
this estimation cannot be used at the same time for an estimation of the stationary sequence 
$\{x_t\}$ in the assumed model $y = Y\theta + x$. So, for instance, the least squares estimators $\hat\theta$ of 
$\theta$, though consistent, may asymptotically not even allow a stationary sequence at all. 

In order to make an estimation of $\theta$ possible $\{x_t\}$ must be submitted to some assumptions 
such as (to stay quite general) (a) weak stationarity, with a possibly non-zero mean function 
$E(x_t) = \mu_t$, (b) finite second moments $E(x_t^2) < \text{const} < \infty$ only. Through (b) in a 
sense the most general class, say $S$, applicable in the model is described. A sufficient condition 
for consistency (in the sense of mean square convergence) of $\hat\theta$, given a certain 
stationary sequence with covariances $R(m)$, is $\sum_m |R(m)| / \lambda_{\min}(Y_N' Y_N) \to 0$. For consistency 
over the whole class (a) (or even over $S$), $N^{-1}\lambda_{\min}(Y_N' Y_N) \to \infty$ is sufficient. However, 
one easily finds regression matrices $Y_N$ of this kind, yet the "covariance function" 
$\hat{R}(m) = E(\hat{x}_t \hat{x}_{t+m})$ of the residuals does not tend to the true one (not to mention an estimation 
of the entire sequence $\{x_t\}$ at all). 


31. On Pairwise Independence. Seymour Geisser and Nathan Mantel, 
National Institutes of Health. 


It is well known in statistical theory that pairwise independence is necessary but not 
sufficient for a set of p variables to be mutually independent. The example that is usually 
cited in the statistical literature is due to S. Bernstein and involves discrete variables. An 
example of continuous variables that exhibit this peculiarity is produced and is simply the 
joint distribution of correlation coefficients from a multivariate normal distribution with 
a diagonal variance-covariance matrix. 
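
The gap between pairwise and mutual independence can be checked by exact enumeration. The Python sketch below uses a standard discrete construction in the spirit of the Bernstein example (two fair bits and their exclusive-or); it is not the continuous example announced in the abstract, and all names in it are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

# Four equally likely outcomes (x, y, z) with z = x XOR y.
outcomes = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]
weight = Fraction(1, len(outcomes))


def prob(event):
    """Exact probability of an event given as a predicate on outcomes."""
    return sum(weight for w in outcomes if event(w))


def marginal(i, a):
    """P(coordinate i equals a)."""
    return prob(lambda w: w[i] == a)


def pairwise_independent():
    """True if every pair of coordinates factorizes."""
    return all(
        prob(lambda w: w[i] == a and w[j] == b) == marginal(i, a) * marginal(j, b)
        for i, j in combinations(range(3), 2)
        for a, b in product((0, 1), repeat=2)
    )


def mutually_independent():
    """True if the joint distribution of all three coordinates factorizes."""
    return all(
        prob(lambda w: w == (a, b, c))
        == marginal(0, a) * marginal(1, b) * marginal(2, c)
        for a, b, c in product((0, 1), repeat=3)
    )
```

Here every pair of coordinates is independent, but the triple is not: for instance the outcome (0, 0, 1) has probability 0 rather than 1/8.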


32. On Tests with Likelihood Ratio Criteria in Some Problems of Multivariate 
Analysis (Preliminary report). N. C. Giri, Stanford University. (Intro- 
duced by Charles Stein.) 


Let $X$ be a $p$-dimensional column vector having multivariate normal distribution with 
unknown mean $\xi$ and unknown non-singular covariance matrix $\Sigma$. In this paper we have 
considered two different testing problems concerning mean $\xi$ and $\Sigma$, viz., 

(i) to test the hypothesis that $\xi$ lies in $Z$ against the alternative that $\xi$ lies in $Y$ where 
$Z$ and $Y$ are subspaces of the parametric space of dimensions $p - p'$ and $p - q$ respectively 
($p > p' > q$); 

(ii) to test the hypothesis that $\Sigma^{-1}\xi$ lies in $Z'$ against the alternative that $\Sigma^{-1}\xi$ lies in 
$Y'$ where $Z'$ and $Y'$ are subspaces of the adjoint space $\mathfrak{X}'$ of the space of $x$'s, of dimensions 
$q$ and $p'$ ($p > p' > q$) respectively. 

It has been shown that the likelihood ratio test for problem (ii) is uniformly most power- 
ful invariant similar; whereas if the sample size N and p’ and q are large then the likelihood 
ratio test for problem (i) is nearly uniformly most powerful invariant. 


33. Circular Probability Problems. William C. Guenther, The Martin Co. 
and University of Wyoming. 


When a circle $C_1$ of radius $R$ is dropped upon a fixed circle $C_2$ of radius $D$, several interesting 
and useful problems arise. Two of these involve (a) the probability that $C_1$ covers a 
randomly selected point within $C_2$, and (b) the probability that $C_1$ covers a randomly 
selected point on the circumference of $C_2$. The first problem has been considered by others 
and results may be found in Rand RM 330. The second problem is considered and the rela- 
tionship between the two problems is observed. The three dimensional counterparts are 
also considered. Tables are included. 


34. An Application of the Sequential Probability Ratio Test to Finite Populations. 
Paul Gunther, Armour Research Foundation. 


Let $x_1, \ldots, x_n$ be the values assumed by a finite population consisting of $n$ members. 
It is desired to derive a sequential procedure to predict whether $y_n = \sum_{j=1}^{n} x_j$ is $\ge C$ or 
$< C$, where $C$ is specified. It is assumed that the finite population is in turn a random sample 
from a normally distributed superpopulation $f(x_j; \theta)$ with unknown mean $\theta$ and known 
standard deviation $\sigma$. (This can be considered also in the sense of an a priori distribution.) 
Further, $\theta$ is assumed to be equal either to $\theta_1$ or $\theta_0$ ($\theta_1 > \theta_0$), each with (a priori) probability 
$\tfrac12$. Define $\alpha = \text{Prob}(\text{predicting } y_n \ge C \mid y_n < C)$, as determined from the a priori distributions, 
and similarly for $\beta$. The SPRT leads to withholding a prediction and taking further 
observations if $B < f(y_i \mid y_n \ge C)/f(y_i \mid y_n < C) < A$ where $y_i = \sum_{j=1}^{i} x_j$; $A$ and $B$ 
are determined in the usual manner; and $f(\cdot)$ is weighted by $\theta_0$ and $\theta_1$. If $C = n \cdot \tfrac12(\theta_0 + \theta_1)$, 
the acceptance numbers $A_i$ are determined from the equation 

$$\exp(-2T_iD_i) = \frac{(1 + A)^{-1} - N(-T_i - D_i)}{N(-T_i + D_i) - (1 + A)^{-1}},$$

where 

$$T_i = \frac{A_i - i\bar\theta}{(n - i)^{1/2}\sigma}, \quad D_i = \frac{(n - i)^{1/2}\Delta}{\sigma}, \quad \bar\theta = \frac{\theta_1 + \theta_0}{2}, \quad \Delta = \frac{\theta_1 - \theta_0}{2},$$

and $N(\cdot)$ is the cumulative normal distribution. 

This equation is easily solved graphically. As $n \to \infty$, the test approaches the usual Wald 
SPRT. A numerical application is made to a problem in "budget control." A derivation 
is also possible without resorting to the a priori probabilities of $\theta_0$ and $\theta_1$. 


35. On Step-Down Procedure in Simultaneous Multivariate Analysis of Variance 
(Preliminary report). P. R. Krishnaiah, Remington Rand UNIVAC. 
(By title) 


Let $X$ denote an $n \times p$ matrix whose rows form $n$ independent vectors having $p$-variate 
normal distributions with a common covariance matrix $\Sigma$ and means given by $E(X) = M\theta$, 
$M: n \times m$, $\theta: m \times p$, rank $M = r \le m \le n$, where $M$ has known elements and is called 
the "design matrix," and $\theta$ is unknown. Now, consider the $K$ orthogonal hypotheses 
$H_i: C_i\theta = 0$, $i = 1, 2, \ldots, K$, against the alternatives $A_i: C_i\theta = \eta_i$. If the variates can be 
arranged in some order according to their importance, the hypotheses $H_1, \ldots, H_K$ can 
be tested simultaneously by using the "Step-Down Procedure". This procedure was first 
used by Roy and Bargmann (these Annals, 1958) for testing the hypothesis of multiple independence 
of sets of variates when the parent population is multivariate normal. In the 
present paper, the tests of significance are derived for testing the hypotheses $H_1, \ldots, H_K$ 
simultaneously by using the "Step-Down Procedure". The confidence bounds on meaningful 
parametric functions are also derived. The extension of these results to random models 
is under investigation. 


36. Linear Hypothesis with Linear Restrictions. André G. Laurent, Wayne 
State University. 


Let $Y$ be weakly spherical (or make it so by changing the definition of the inner product) 
with $E(Y) = \mu = A\theta$, where $A$ is $n \times k$ of rank $r \le k$, with column space $\Omega$, restricted by 
$L\theta = O$, $L' = (R', P')$, with $R\theta$ nonestimable, where $R$ is $s \times k$ of rank $s \le k - r$ and 
$P\theta$ estimable where $P$ is $t \times k$ of rank $t \le r$ and restricts $\mu$ to $\omega$. Completing $R\theta = O$ to 
nonestimable $R^*\theta = O$, where $R^*$ is $(k - r) \times k$ of rank $k - r$, makes $\theta$ (hence any $M\theta$) 
"pseudo estimable" i.e., with structure $(C, \mu)$, $C \in \Omega$ (or $\omega$) under $\Omega$ (or $\omega$), with best estimate 
$\hat\theta = (C, Y)$. Let $L^{*\prime} = (R^{*\prime}, P')$. 


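$$\begin{pmatrix} \hat\theta_\omega \\ \hat\lambda \end{pmatrix} = \begin{pmatrix} A'A & L^{*\prime} \\ L^* & O \end{pmatrix}^{-1} \begin{pmatrix} A'Y \\ O \end{pmatrix}, \qquad \hat\theta_\omega = B_1 A'Y, \qquad \widehat{M\theta}_\omega = M\hat\theta_\omega,$$


with $A'AB_1 + L^{*\prime}B_2 = I$ and $B_1'L^{*\prime} = O$. Several proofs of the above equations are given 
as well as geometrical interpretations. If $A$ is of full rank, $R = O$, obtaining $\hat\theta_\omega$ is straightforward. 
These results extend and generalize those of P. Dwyer and others. 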


37. The Effect of Convergence to Normality on Tests of Hypotheses. Lloyd J. 
Montzingo and Norman C. Severo, University of Buffalo. (By title) 


Let $X$ be a random variable with mean $\mu_x$ and standard deviation $\sigma_x$, and let the distribution 
of $X$ tend to normality as some function of the parameters $\eta(\mu_x, \sigma_x) \to \eta_0$. The 
result of applying normal theory tests of hypotheses on the mean or variance to a sample 
from the distribution of $X$ is considered. Denote by $P_n$ the power of a test based on a sample 
of size $n$ ($n$ fixed) from the distribution of $X$, and by $P$ the power of the test if $X$ were 
normally distributed. Then sufficient conditions are given for which $P_n \to P$ as 
$\eta(\mu_x, \sigma_x) \to \eta_0$. 


38. On the Asymptotic Normality and Independence of the Sample Partial 
Autocorrelations for an Autoregressive Process. V. K. Murthy, Stanford 
University. (By title) 


For a stationary autoregressive model of order $s$, the partial autocorrelation coefficients 
of order $j$, $j = 0, 1, 2, \ldots, s - 1$ are defined; the partial autocorrelation coefficient of order 
zero being the same as the autocorrelation coefficient of order one. Denoting these $s$ parameters 
by $\rho_1, \pi_1, \ldots, \pi_{s-1}$, it is shown in this paper that their sample images namely 
$r_1, p_1, \ldots, p_{s-1}$ are asymptotically independently normally distributed with means equal 
to the corresponding population values and asymptotic variances given by 


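$$\operatorname{Var}(r_1) = n^{-1}(1 - \rho_1^2)(1 - \pi_1^2) \cdots (1 - \pi_{s-1}^2);$$
$$\operatorname{Var}(p_j) = n^{-1}(1 - \pi_j^2)(1 - \pi_{j+1}^2) \cdots (1 - \pi_{s-1}^2), \quad j = 1, 2, \ldots, s - 1,$$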


where n is the size of the sample from the autoregressive process of order s. The partial 
correlogram of the model and application of the result are discussed. 


39. On Fitting a Linear Trend and Testing Independence when the Residuals 
Form a Markov Process. V. K. Murthy, Stanford University. (By title) 


In this note we are studying the problem of fitting a linear trend when the residuals are 
serially correlated according to a first order Markov scheme. An iteration method for solv- 
ing the maximum likelihood equations is proposed and an explicit criterion for the con- 
vergence of the iteration process is obtained. It is incidentally shown that for large samples 
the serial dependence may be neglected and ordinary least squares analysis used for estimating 
the trend. A general result in this direction was proved by Herman Wold and Lars 
Juréen [Demand Analysis, John Wiley and Sons, New York (1953)]. The asymptotic variance-
covariance matrix of the maximum likelihood estimates and the likelihood ratio criterion 
for testing $\rho = 0$ are obtained. Extending a result of Ogawara [Ann. Math. Stat., Vol. 22 
(1951), pp. 115-118], the problem of regression when the residuals are serially correlated 
according to the first order Markov scheme is reduced to the classical case; in the case of 
fitting a linear trend exact tests for the regression parameters and the hypothesis $\rho = 0$ 
are derived. Illustrating the iteration method an application to the data on the average 
yield per acre of potatoes from 1903 to 1932 of the United States is worked out. 


40. On the Cumulants of a General Renewal Process. V. K. Murthy, Stanford 
University. 


In this paper the results of Smith on the cumulants of a Renewal Process are extended 
to the case of a General Renewal Process. After establishing the asymptotic representation 
theorems for the -moments and ¥-cumulants of a General Renewal Process, the table of 
the first eight cumulants of a Renewal Process has been extended to the case of a General 
Renewal Process. A theorem is proved leading to a check on the calculations. As a particu- 
lar case of the General Renewal Process, the cumulants of the ‘‘Equilibrium Process’’ are 
obtained. 


41. Some Distribution-Free Multiple Comparison Procedures in the Asymptotic 
Case. Peter Nemenyi, S.U.N.Y. College of Medicine at Brooklyn. 


By means of a generalization of Stuart’s transformation for correlated variables [J. 
Roy. Stat. Soc., Ser. B, Vol. 20 (1958), 373-378] and by an alternative method, it is shown 
that existing tables for multiple comparisons of normal means also apply to a large family 
of asymptotic permutation procedures, including Steel’s rank and sign tests, some median 
tests, and multiple comparisons based on Kruskal-Wallis rank totals. The tests, designed 
for translation alternatives, can also be adapted to the problem of differences in spread 
(but in this case it is more difficult to obtain confidence intervals). 

The tabulation of $(1 - \alpha)^{1/k}$ and $\tfrac12[(1 - \alpha)^{1/k} + 1]$ points of various one- and two-sample 
statistics (e.g., for setting simultaneous sign-test confidence intervals on $k$ median treatment 
effects) is also advocated, and some tables are provided. 


42. Formulation of a Model Containing a Chance Mechanism according to 
which Observations are Missed: The Randomized Block Design. Junjiro 
Ogawa, Nihon University, Tokyo, Japan and Bernard S. Pasternack, 
New York University Medical Center. (By title). 


In this paper an attempt has been made to introduce a chance mechanism according to 
which observations are missed into the model and subsequent analysis of the randomized 
block design. By partitioning the "design matrix" for the randomized block design into 
$(\Phi : \Psi)$, it is possible to incorporate the process of randomization for this design directly 
into the theoretical model. The exact distribution of the sum of squares due to treatments 
contingent upon the incidence matrix for treatments, $\Phi$, being random, and the incidence 
matrix for blocks, $\Psi$, being fixed, can then be rigorously obtained. 

When there is an a priori probability that observations may be missing in a randomized 
block design, this may be accounted for in the model by introducing a set of mutually independent 
chance variables 

$$m_\mu = \begin{cases} 0 & \text{with probability } p_\mu \text{ if } x_\mu \text{ is missing,} \\ 1 & \text{with probability } 1 - p_\mu = q_\mu \text{ if } x_\mu \text{ is not missing.} \end{cases}$$

The vector $m' = (m_1, m_2, \ldots, m_n)$ is called the missing observation vector. The probability 
distribution of $m'$ is given by $\prod_{\mu=1}^{n} p_\mu^{1-m_\mu} q_\mu^{m_\mu}$. On the basis of this model, a modified ($F$) 
test statistic is obtained. The extension of this approach to other more complex designs is 
formal. Whether or not it is possible to obtain the exact or approximate distribution of this 
statistic is, at the moment, an open question. 


43. On Some Methods of Estimation for the Logarithmic Series Distribution. 
G. P. Patil, University of Michigan. 


Applications of the logarithmic series distribution have been discussed, among others, by 
Fisher (1943), Williams (1943, 1944), Harrison (1945) and Kendall (1948). The problem of estimation, 
however, does not seem to have been thoroughly investigated. This paper provides 
different estimates for the parameter of the logarithmic series distribution and investigates 
their efficiency and the amount of bias in certain special cases. 


44. Some Asymptotic Properties of the Negative Binomial Distribution. Vivian 
Pessin, Children's Hospital, Buffalo, N. Y. (By title) (Introduced by 
Norman C. Severo.) 


The following two theorems have been proved: 


Theorem 1. The negative binomial frequency function is asymptotically normal as $\lambda/\alpha \to \infty$. Let 


r z am 
enna NL EGt A, - tyidoarpine 
l+ea x 


Let mo = [(A — 1)/a] — w((A — 1) (1 + @))#/oa, oo = ((A — 1) (1 + @))#/oa, where o is an 
arbitrary constant > 0, and yu is an arbitrary constant. Let y = (x — mo)/oo. Then, for 
fixed z, and for a bounded away from 0 and from ~, 


lim) jae Q(n = y) = o (2x)? exp (—(y — u)?/20?), —-e <cy< @, 


Theorem 2. The negative binomial frequency function is asymptotically the Gamma 
frequency function as a — 0, for fixed \ such that 0 < A < 1. Using the same notation as 
in the first theorem, let m, = A — 1, «1 = k((1 + a)/a), where k is an arbitrary constant 
> 0. Then lima+o Q(n = y) = (k*e~*y")/(T(A — 1)) = g(y), which becomes the gamma 
frequency function when g(y) is defined to be 0, for y S 0. 


45. On Horvitz and Thompson's $T_1$-Class Estimators (Preliminary report). 
S. G. Prabhu-Ajgaonkar and B. D. Tikkiwal, Karnatak University, 
Dharwar, India. (By title) 


Horvitz and Thompson (J.A.S.A., 1952), while discussing sampling with varying probability 
and without replacement, have given three classes of estimators. If an empty class 
is that where the unbiased estimators independent of population values do not exist, it is 
shown that their $T_1$-class is in general an empty class when sampling with varying probability 
is adopted. However, when Midzuno's system of sampling is adopted with replacement, 
such a class of estimators exists and has a minimum variance unbiased estimator in 
the class independent of the population values. The $T_3$-class, which is non-empty, has no 
minimum variance unbiased estimator independent of the population values even when 
simple random sampling is adopted. The non-empty $T_2$-class is known to have only one 
unbiased estimator, which is therefore a minimum variance unbiased estimator. It is noted that for 
Midzuno's system of sampling with replacement the $T_2$-class estimator has a smaller variance 
than that of the minimum-variance unbiased estimator of the $T_1$-class. It is further noted for 
sampling with varying probability and with replacement that the minimum variance 
unbiased estimator in the overall class consisting of the classes $T_1$ and $T_2$ is the unbiased 
estimator in the $T_2$-class. However, the relative efficiency of the $T_2$-class estimator and an estimator 
in the $T_3$-class depends upon the probability system adopted. 


46. The Role of the Multivariate Edgeworth Series in the Random Walk Problem. 
J. F. Price and W. M. Strong, Boeing Airplane Co., Seattle, Washington, 
and J. D. Wheelock, Oregon State University. 


For the random walk in $N$-space denote the $\nu$th step by the random vector 
$p_\nu = (\cos\varphi_{1\nu}, \cos\varphi_{2\nu}, \ldots, \cos\varphi_{N\nu})$ where $\cos\varphi_{k\nu}$ ($k = 1, 2, \ldots, N$) is a direction cosine. 
The point $z_n$ attained after $n$ steps from the origin is then the resultant $z_n = p_1 + p_2 + 
\cdots + p_n$. The vectors $p_\nu$ are independent (and identically distributed) so that $z = \lim_{n\to\infty} z_n$ 
follows the $N$-dimensional normal distribution. The paper formally expresses the probability 
density function of $z_n$ as a multivariate Edgeworth series and deduces therefrom, for 
$N = 2$, the asymptotic "modified Pearson" series discussed by Greenwood and Durand 
(Ann. Math. Stat., 26: 233-246 and 28: 978-986) for the distance $r_n$ from the origin. This 
approach has the advantage of using known moments (or cumulants) with maximum efficiency, 
requiring only those moments necessary in the determination of the polynomial 
coefficients of a given power of $1/n$, in contrast to the Laguerre series approach previously 
employed. A table of probabilities $P(R < r_n)$ for $n = 4, 5, \ldots, 24$ was constructed by 
quadrature from the known exact distribution for $N = 3$, and is presented for comparison 
with results from the modified Pearson series. 


47. On a Mathematical Model for Poliomyelitis Vaccine Effectiveness (Preliminary 
report). Dana Quade, Communicable Disease Center, Atlanta, 
Georgia. 


A linear relationship between the logarithm of the attack rate of paralytic poliomyelitis 
and the number of doses of killed-virus vaccine received, which "indicates that each successive 
dose reduced the remainder of the susceptibles by the same proportion as did the 
first dose", was discovered by Dr. Jonas Salk (The Lancet, October 1, 1960, pp. 715-723). 
This model is made explicit and various mathematical and statistical problems which it 
entails are considered. 
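
The log-linear relationship described here can be made concrete with a small Python sketch. It assumes the simplest reading of the model, namely attack rate after $d$ doses equal to $r_0(1 - e)^d$, so that $\log(\text{rate})$ is linear in $d$ with slope $\log(1 - e)$. The attack rates below are invented for illustration only, and the function name is hypothetical; neither is taken from the abstract or from Salk's data.

```python
import math


def fit_log_linear(doses, rates):
    """Least-squares fit of log(rate) = a + b * doses; returns (a, b)."""
    logs = [math.log(r) for r in rates]
    n = len(doses)
    mean_d = sum(doses) / n
    mean_y = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in doses)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in zip(doses, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_d
    return intercept, slope


# Invented rates: baseline 40 per 100,000, each dose leaving 25% of the risk.
doses = [0, 1, 2, 3]
rates = [40.0 * 0.25 ** d for d in doses]

intercept, slope = fit_log_linear(doses, rates)
baseline_rate = math.exp(intercept)          # estimated rate with no doses
per_dose_reduction = 1.0 - math.exp(slope)   # proportion removed by each dose
```

Under this reading, a constant slope on the log scale is exactly the statement that every dose removes the same proportion of the remaining susceptibles.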


48. The Distribution of the Ratio of the Variances of Variate Differences in 
the Circular Case. J. N. K. Rao and G. Tintner, Iowa State University. 


In time series analysis, the variate difference method is used to test the order of the 
finite difference at which the trend or the systematic part in the time series is approximately 
eliminated. There is no exact test available in the literature except for the one proposed by 
Tintner ("The variate difference method," Bloomington, Indiana, 1940) based on a method 
of selection which uses only a portion of the observations. In this paper, the statistic 
$V_{k+1}/V_k$ is proposed to test that the trend is approximately eliminated at the $k$th finite 
differencing of the series where $V_k$ is the variance of the series of the $k$th differences. Its 
exact distribution assuming that the observations are NID$(0, \sigma^2)$ is derived under a circular 
definition of the universe. The lower 5% and 1% points of the statistics $V_2/V_1$ and $V_3/V_2$ 
are tabulated for various values of N, the size of the sample. In practice, one uses the non- 
circular statistic with these percentage points for the circular statistic as an approximation, 
especially with long time series. 
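
For orientation, the non-circular version of the statistic can be computed as in the Python sketch below. It assumes the usual variate-difference normalization $V_k = \sum_t (\Delta^k x_t)^2 / [(N - k)\binom{2k}{k}]$, under which $E(V_k) = \sigma^2$ for every $k$ when the series is pure NID$(0, \sigma^2)$ noise. The abstract does not spell out its normalization, so this choice and the function names are assumptions.

```python
from math import comb


def kth_differences(x, k):
    """k-th finite differences of the sequence x."""
    d = list(x)
    for _ in range(k):
        d = [b - a for a, b in zip(d, d[1:])]
    return d


def variate_difference_variance(x, k):
    """V_k = sum of squared k-th differences / ((N - k) * C(2k, k))."""
    d = kth_differences(x, k)
    if not d:
        raise ValueError("series too short for k-th differences")
    return sum(v * v for v in d) / (len(d) * comb(2 * k, k))


def variance_ratio(x, k):
    """Non-circular ratio V_{k+1} / V_k."""
    return variate_difference_variance(x, k + 1) / variate_difference_variance(x, k)
```

A ratio $V_{k+1}/V_k$ well below 1 suggests that the $k$th differences still contain systematic variation; a ratio near 1 is what pure noise would give. The percentage points referred to in the abstract apply to the circular statistic, which this sketch does not compute.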


49a. Estimation of Failure Rates of Systems in Development. David Rubinstein, 
General Electric Company. 


Given an $m \times \infty$ matrix $(C_{ij})$ of populations of components with the corresponding 
matrices: $(\lambda_{ij})$ of failure rates, $(T_{ij})$ of test times, $(X_{ij})$ of the number of failures, $(a_{ij})$ of 
acceptance numbers. The $X_{ij}$ are independent Poisson random variables with parameter 
$\lambda_{ij}T_{ij}$. The $a_{ij}$ are nonnegative integers or $\infty$. Components from population $C_i^*$ with failure 
rate $\lambda_i^*$ will be used in the system if $C_i^* = C_{ij}$ where $X_{ij} \le a_{ij}$ and $j < j'$ for any $j'$ for 
which $X_{ij'} \le a_{ij'}$. 


Let 8;; = Oif Xi; > aij , 1 if Xi; S ai; . Let Ky; = Of Xi; > ai; + 1, Xi; /Ti; if Xi; S 
nij + 1. Under rather general conditions, or 7 hij Tit (1 — 8%) is an unbiased 
estimate of the system failure rate of 5-7: AT in the sense that difference of the two ran- 
dom variables has the expected value zero. Let o7; = Oif Xi; > ai; + 2, Xi; (Xi; — 1)/T?; 
if Xi; = ai; + 2, Xi;/Ti if Xj S ney +1. DH PL a; TJ (1 — bx) is an unbiased 
estimate of 


ti=—1 j=l k=l i=1 


e[E Ealfa-w- Le] 


49b. Determining Bound on Expected Values of Certain Functions. Bernard 
Harris, University of Nebraska. 


This extends some results given by the author in "Determining Bounds on Integrals 
with Applications to Cataloging Problems" (Ann. Math. Stat. Vol. 30, 1959). Let $g(x)$ be a 
continuous function, not linearly dependent on the first $k$ monomials, whose first $k$ derivatives 
exist and are monotonic; $\mu_1, \mu_2, \ldots, \mu_k$ are known constants and the first $k$ moments 
of an unknown distribution function $F(x)$. The sup(inf) $E\{g(x)\}$ is computed, the sup(inf) 
being taken over all distribution functions whose first $k$ moments are given by $\mu_1, \mu_2, 
\ldots, \mu_k$. The extremal distributions are characterized, and computed explicitly for $k < 3$. 
In addition, some applications are given. 


50. Convergence to Normality of Functions of a Normal Random Variable. 
Norman C. Severo and Lloyd J. Montzingo, University of Buffalo. 
(By title) 


The asymptotic distributions of functions of a normal random variable are investigated 
as some function of the parameters tends to a limit. It is assumed that the functions of the 
normal variate will be defined in such a way as to be real for all values of the variate. In 
particular, if $Y$ is a normal random variable with mean $\mu_y$ and standard deviation $\sigma_y$, and if 
$X = Y^p$, where $p > 0$, has mean $\mu_x$ and standard deviation $\sigma_x$, then $X$ is asymptotically 
normally distributed with mean $\mu_x$ and standard deviation $\sigma_x$ as $\eta = \mu_y/\sigma_y \to \infty$. If $p < 0$, 
$X$ is asymptotically normally distributed with mean $\mu_y^p[1 + O(\eta^{-2})]$ and standard deviation 
$(|p|\mu_y^p/\eta)[1 + O(\eta^{-2})]^{1/2}$. Sufficient conditions on a function $h$ are given for the transformed 
variate $X = h(Y)$ to be asymptotically normal with mean $h(\mu_y)$ and standard deviation 
$h'(\mu_y)\sigma_y$ as $\eta \to \infty$. 

It is shown that, for a large class of transformations $h$, the variate $X = h(Y)$ is asymptotically 
normal as $\sigma_y \to 0$ with mean $\mu_x$ and standard deviation $\sigma_x$ providing they exist. If 
they do not exist, the asymptotic mean and standard deviation are $h(\mu_y) + O(\sigma_y^2)$ and 
$h'(\mu_y)\sigma_y[1 + O(\sigma_y^2)]^{1/2}$. The condition $h'(\mu_y) \ne 0$ is shown to be necessary for convergence to 
normality. Furthermore, the asymptotic distribution of $h(Y)$ is characterized when the 
first $m$ derivatives of $h$, at $\mu_y$, are 0. 
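
The first-order approximation behind these statements (mean $h(\mu_y)$, standard deviation $|h'(\mu_y)|\sigma_y$ when $\sigma_y$ is small) can be compared with simulation. The Python sketch below is an illustration only: the choice $h = \exp$, the parameter values, the seed, and the function names are assumptions, and the sketch does not reproduce the paper's conditions or error terms.

```python
import math
import random
import statistics


def delta_method(h, dh, mu_y, sigma_y):
    """First-order mean and standard deviation of X = h(Y), Y ~ N(mu_y, sigma_y^2)."""
    return h(mu_y), abs(dh(mu_y)) * sigma_y


def simulate(h, mu_y, sigma_y, draws, seed):
    """Monte Carlo mean and standard deviation of h(Y)."""
    rng = random.Random(seed)
    xs = [h(rng.gauss(mu_y, sigma_y)) for _ in range(draws)]
    return statistics.fmean(xs), statistics.pstdev(xs)


# Illustrative case: h = exp (its own derivative), mu_y = 1, small sigma_y.
approx_mean, approx_sd = delta_method(math.exp, math.exp, 1.0, 0.01)
mc_mean, mc_sd = simulate(math.exp, 1.0, 0.01, draws=20000, seed=0)
```

With $\sigma_y = 0.01$ the simulated and approximate values agree closely; repeating the comparison with a larger $\sigma_y$ shows the approximation degrading, which is the sense in which normality is approached only as $\sigma_y \to 0$.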


51. A Probability Model for Couple Fertility. S. N. Singh, University of California, 
Berkeley. 


A probability distribution for the number of conceptions to a couple (a male and a female 
leading a married life), during a given time interval $T$, is derived on the assumptions: 
(a) the probability of a virtual conception in a unit of time is $p$, independently of virtual 
conceptions in any other units of time, where the interval is assumed to contain $T$ units of time. The 
probability of conception in the first unit of time is $p$. (b) if there is a conception in a certain 
unit of time, then there is no conception during the next A units of time. A is constant. 
(c) a couple belongs to one of two mutually exclusive groups A and B during time $T$. Group 
A consists of sterile couples and couples who choose to be so, group B consists of couples 
not belonging to group A. The group B is homogeneous in the sense that any couple of 
group B has the same $p$, the probability of conception in a unit of time. Estimates of parameters 
are based on sample mean and zero cell frequency. The asymptotic variances of the 
estimates are derived. The distribution has been applied to two examples given by Dandekar 
(Sankhyā, Vol. 15 (1955), pp. 237-250). 
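
Assumptions (a) to (c) can be simulated directly, which gives a quick check on any closed-form distribution derived from them. The Python sketch below is one reading of the model, not the author's derivation: `rest` plays the role of the constant number of conception-free units in (b), `fertile_fraction` is the share of couples in group B, and all names and numerical values are hypothetical.

```python
import random


def simulate_conceptions(T, p, rest, fertile_fraction, rng):
    """Number of conceptions for one couple over T units of time."""
    # Assumption (c): a couple outside group B has no conceptions.
    if rng.random() >= fertile_fraction:
        return 0
    count, t = 0, 0
    while t < T:
        if rng.random() < p:      # assumption (a): conception with probability p
            count += 1
            t += rest + 1         # assumption (b): skip the next `rest` units
        else:
            t += 1
    return count


def empirical_distribution(T, p, rest, fertile_fraction, couples, seed=0):
    """Relative frequencies of the number of conceptions over many couples."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(couples):
        k = simulate_conceptions(T, p, rest, fertile_fraction, rng)
        counts[k] = counts.get(k, 0) + 1
    return {k: v / couples for k, v in sorted(counts.items())}
```

The simulated zero-cell frequency and mean are the same two summaries the abstract uses for estimation, so the output can be set against estimates obtained from them.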


52. A Method for Computing the Cumulative Distribution Function of the 
Product of Two Dependent t-Variables. Rosedith Sitgreaves, Teachers 
College, Columbia University. 


We suppose we have three variables, $y_1$, $y_2$, and $a$, distributed independently of each 
other; the first two are normal with zero means and unit variances, and the third has a 
chi-square distribution with $n$ degrees of freedom. The marginal distribution of each of the 
variables $t_1 = y_1(n/a)^{1/2}$ and $t_2 = y_2(n/a)^{1/2}$ is thus Student's $t$-distribution with $n$ degrees of 
freedom, but the two $t$-variables are not independent. In some problems we are interested 
in computing the cumulative distribution function of the product $t_1t_2$. An integral representation 
is found for the probability that this product is less than a specified value. This 
integral can be evaluated relatively easily by numerical integration to any desired accuracy. 


53. A Monte Carlo Analysis of the Serial Correlation Coefficient. John S. 
White, General Motors Research Labs. 


Let $(x_t)$ be a discrete stochastic process satisfying the autoregressive equation $x_t = 
\alpha x_{t-1} + u_t$ where the $u$'s are NID$(0, 1)$. The limiting distribution of $\hat\alpha$, the MLE for $\alpha$, is 
known (J. S. White, Ann. Math. Stat., Vol. 30 (1959), 831-834) except when $\alpha = \pm 1$. In 
this paper the results of a Monte Carlo analysis of the distribution of $\hat\alpha$ are given. Samples 
of size $n = 10, 20, 50, 100$, and $500$ were drawn from populations having $\alpha = -2, -1.1, 
-1.0, -.9, -.5, 0, .5, .9, 1.0, 1.1$ and $2$. 
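
A Monte Carlo experiment of this type can be sketched in a few lines of Python. The sketch assumes the starting value $x_0 = 0$ and the estimator $\hat\alpha = \sum_t x_t x_{t-1} / \sum_t x_{t-1}^2$; the abstract does not state these details, and the function names, seed, and number of replications are likewise assumptions.

```python
import random
import statistics


def simulate_ar1(alpha, n, rng):
    """Generate x_0 = 0, x_1, ..., x_n with x_t = alpha * x_{t-1} + u_t, u_t ~ N(0, 1)."""
    x = [0.0]
    for _ in range(n):
        x.append(alpha * x[-1] + rng.gauss(0.0, 1.0))
    return x


def estimate_alpha(x):
    """Serial correlation estimate: sum x_t x_{t-1} / sum x_{t-1}^2 (needs n >= 2)."""
    numerator = sum(cur * prev for cur, prev in zip(x[1:], x[:-1]))
    denominator = sum(prev * prev for prev in x[:-1])
    return numerator / denominator


def monte_carlo(alpha, n, replications, seed=0):
    """Estimates of alpha from independent simulated series of length n."""
    rng = random.Random(seed)
    return [estimate_alpha(simulate_ar1(alpha, n, rng)) for _ in range(replications)]


# Illustrative run for one of the parameter values listed in the abstract.
estimates = monte_carlo(alpha=0.5, n=500, replications=200)
mean_estimate = statistics.fmean(estimates)
```

Repeating the run over the listed values of $\alpha$ and $n$ and tabulating the empirical distribution of the estimates would mirror the design described in the abstract.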


54. Distribution Function for Randomized Factorial Experiments. S. Zacks, 
The Technion, Israel Institute of Technology and S. Ehrenfeld, New 
York University. 


In a previous paper on Randomization and Factorial Experiments [S. Ehrenfeld and 
S. Zacks, These Annals, Vol. 32 (1961), pp. 270-297] two randomization procedures, for choos- 
ing fractional replications, were studied. These procedures have been designed to yield 
information on a subgroup of preassigned parameters. Schemes of the analyses of variance, 
associated with each of the proposed randomization procedures, were also given. The ob- 
jective of the present paper is to study the distribution functions of the associated test 
statistics, and to establish procedures for the determination of test criterions for given 
levels of significance, as well as the power of the tests. 

The distribution functions of the test statistics, for testing the significance of the chosen 
parameters, depend on the nuisance parameters (those which do not belong to the preas- 
signed subgroup) in a manner that is determined by the randomization procedure. Since 
the experimenter generally lacks detailed information on the nuisance parameters, the 
problem is to appraise the sensitivity of the test functions (criterions) to variations in the 
nuisance parameters. 

It is shown that the effect of the nuisance parameters on the distribution function of the 
test statistics is through statistics of non-centrality, analogous to the parameters of non- 
centrality of the F-statistics in the non-randomized case. The low order moments of the 
statistics of noncentrality are studied, and the distribution functions of the test statistics 
are approximated by linear contrasts of double non-central F-distributions multiplied by 
the central moments of the statistics of non-centrality. 


(Abstract not connected with any meeting of the Institute.) 


1. Some Property of a Sequence of Random Events. Marek Fisz, University of 
Warsaw, Poland and University of Washington. (By title) 


As far as the author is aware, the following simple theorem has never been published. Denote
by $A_n$ ($n = 1, 2, \ldots$) a sequence of random events, $B = \bigcap_n \bar{A}_n$, $p_n = P(A_n)$,
$v_n = P(A_{n+1} \mid \bar{A}_1 \cdots \bar{A}_n)$. Assume that $0 < p_n < 1$, $0 < v_n < 1$ ($n = 1, 2, \ldots$). Then $P(B) > 0$
if and only if (*) $\sum_n v_n < \infty$. It is known that if both of the relations $\sum_n p_n = \infty$ and
(**) $\sum_n v_n = \infty$ hold, then $P(\limsup_n A_n) = 1$. If, however, (***) $\sum_n p_n < \infty$ and (**)
hold, then by virtue of the Borel-Cantelli Lemma, $P(\limsup_n A_n) = 0$ while the probability
of occurrence of at least one of the $A_n$ is positive. If the $A_n$ are independent, relations (*)
and (***) are equivalent and the author’s theorem asserts $P(B) > 0$ while the Borel-Cantelli
Lemma asserts the weaker relation $P(\limsup_n A_n) = 0$.


CORRECTION TO ABSTRACT 
“MOMENTS OF THE RADIAL ERROR” 
By Ernest M. Scheuer


The following corrections should be made in the above-titled abstract (Ann. Math. 
Stat., Vol. 32 (1961), p. 638). Replace sentences two and three by the following: 

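Let $\sigma_1^2 = \tfrac12\{(\sigma_{11} + \sigma_{22}) + [(\sigma_{11} - \sigma_{22})^2 + 4\sigma_{12}^2]^{1/2}\}$, $\sigma_2^2 = \tfrac12\{(\sigma_{11} + \sigma_{22}) - [(\sigma_{11} - \sigma_{22})^2 + 4\sigma_{12}^2]^{1/2}\}$,
$k^2 = (\sigma_1^2 - \sigma_2^2)/\sigma_1^2$. Then the moments about the origin $\mu'_n$ of the radial error $R = [z_1^2 + z_2^2]^{1/2}$
are $\mu'_n = 2^{n/2}\sigma_1^n\,\Gamma[\tfrac12(n + 2)]\,F(-\tfrac12 n, \tfrac12, 1; k^2)$, where $F(a, b, c; z)$ is the hypergeometric function.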
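
As a quick consistency check on the moment formula, here is a sketch of the case n = 2. Write $\mu'_n$ for the nth moment about the origin, and assume (the abstract does not say so) that both components have zero means, so that $E(R^2)$ equals the trace $\sigma_{11} + \sigma_{22} = \sigma_1^2 + \sigma_2^2$ of the covariance matrix. The hypergeometric series terminates after two terms when its first argument is $-1$:

```latex
\begin{align*}
\mu'_2 &= 2\,\sigma_1^{2}\,\Gamma(2)\,F\!\left(-1, \tfrac12, 1; k^{2}\right)
        = 2\,\sigma_1^{2}\left(1 - \tfrac12 k^{2}\right) \\
       &= 2\,\sigma_1^{2} - \left(\sigma_1^{2} - \sigma_2^{2}\right)
        = \sigma_1^{2} + \sigma_2^{2}.
\end{align*}
```

Under the zero-mean assumption this agrees with the direct value of the second moment.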





NEWS AND NOTICES 
Readers are invited to submit to the Secretary of the Institute news items of interest 


Personal Items 


Professor Frances E. Baker is returning to Vassar College (as of September, 
1961) after a year’s leave of absence spent at the University of North Carolina. 

Dr. Friedhelm Eicker has accepted a position at the Institut für Angewandte
Mathematik, Freiburg, Germany, beginning in the Fall of 1961. During the past 
academic year he has been visiting at the Department of Statistics, University of 
North Carolina, and spent the summer at Stanford University. 

Henry W. Gould has received a promotion from instructor to assistant pro- 
fessor of mathematics at West Virginia University. 

Dr. C. W. J. Granger has returned to his post as Lecturer in Statistics at the 
University of Nottingham, England after a year at the Econometric Research 
Program, Princeton University. 

W. J. Hall has been appointed Associate Professor in the Department of 
Statistics at The University of North Carolina in Chapel Hill. He will continue 
part-time teaching in the Department of Experimental Statistics at North 
Carolina State College in Raleigh. 

Bruce M. Hill has received his Ph.D. Degree in Statistics from Stanford University.
The topic of his dissertation was “A Test of Linearity Versus Convexity
of a Median Regression Curve.”

Eugene H. Lehman, Jr., Instructor at Purdue Statistical Laboratory, com- 
pleted his Ph.D. in Experimental Statistics with a Mathematics minor at North 
Carolina State College, March 28, 1961. His dissertation was in the area of re- 
liability, life testing and the Weibull distribution. Upon passage of his final 
examination, he was elevated to Assistant Professor. 

Leone Y. Low received her Ph.D. in statistics from Oklahoma State Uni- 
versity in May, 1961 and has been an instructor at the University of Illinois for 
the past year. 

V. E. Palmour has accepted a position as Operations Analyst with the Opera- 
tions Evaluation Group, Massachusetts Institute of Technology. 




New Members 


The following persons have been elected to membership in the Institute 


Allen, Terrence M., Ph.D. Psychology (Purdue University); Assistant Professor, Psychol- 
ogy Department and Traffic Safety Center, Michigan State University; 700 W. Grand 
River Ave., East Lansing, Mich.

Atiquilah, A. M., Ph.D. (University of Dacca, Pakistan); Lecturer, Department of Sta- 
tistics, University of Dacca, Pakistan; Department of Mathematics, Birkbeck College, 
University of London, Malet Street, London, W. C. 1, England. 
Bartko, John J., M.S. Statistics (Virginia Polytechnic Institute); Student, Department of
Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia. 

Becker, Oliver G., B.S. Military Engineering (U.S. Military Academy); Communication 
System Engineer, ITT Communication Systems, Inc., Garden State Plaza, Paramus, N. J.

Bhattacharya, Prodyot K., D.Phil. Se. (University of Calcutta); Research Associate, 
Department of Statistics, University of North Carolina, Chapel Hill, N.C. 

Bhuchongkul, Subha, M.A. (University of California, Berkeley); Student, University of 
California; Department of Statistics, University of California, Berkeley 4, California. 

Billimoria, Adi R., B.S. (University of Bombay); Actuarial Clerk, The U.S. Life Insurance 
Co. in the City of New York, 125 Maiden Lane, New York 38, N. Y.; 740 West End 
Avenue, New York 25, N.Y. 

Blowney, David P., M.S. Physics (University of Wisconsin); Engineer, Sperry Gyroscope 
Company, Great Neck, N. Y.; 14 Matthews Street, Huntington Station, N.Y. 

Bjork, Lawrence A., B.A. (Northwestern University); Graduate Student, University of 
Chicago, Department of Statistics; 1739 E. 67 St., Chicago 49, Ill. 

Borgman, Leon E., M.S. (University of Houston); Graduate Student, Statistics Department, 
University of California, Berkeley 4, California. 

Brugger, Richard Michael, B.S. (University of Illinois); Graduate Student, University of 
Illinois, Urbana, Illinois; 4615 N. Claremont Ave., Chicago 25, Ill. 

Burr, Edmund John, M.S. (University of Queensland) ; Lecturer, University of New England, 
Armidale, N. S. W., Australia. 

Chan, Carl C. Y., B.A. (Oklahoma Baptist University); Graduate Student, University of 
California, Berkeley 4, California; 2731 Durant Ave., Berkeley 4, California. 

Chang, I-Chen, Ph.D. Civil Engineering (Illinois Institute of Technology); Stress Analyst, 
Scientific Design Company, 2 Park Ave., New York, N. Y.; Apt. 6-A, 434 W. 120th 
Street, New York, 27, N. Y. 

Cohn, Jacob L., A.B. Mathematics (Hunter College); Associate Analyst, Analytic Services, 
Inc., 5202 Leesburg Pike, Alexandria, Va.; 415 Orleans Circle, SW Vienna, Va. 

Comer, John P., Jr., M.S. Electrical Engineering (M.I.T.); Statistical Consultant, The 
Procter and Gamble Co., Ivorydale Technical Center, Cincinnati 17, Ohio; 88 Morningside
Dr., New York 27, N. Y.

Cote, Roger William, M.S. (Michigan State University); Graduate Student, Department of 
Statistics, Michigan State University, East Lansing, Michigan. 

Creech, F. Reid, B.S. (University of Wyoming); Research Assistant, Department of Sta- 
tistics, University of Wyoming, Laramie, Wyoming; 86614 North Fifth St., Laramie, 
Wyoming. 

Daniel, Klaus H., Diploma-Mathematiker (University of Goettingen); Research Assistant,
Teaching Assistant, University of California, Berkeley, California. 

Davis, Stephen A., B.A. (University of Chicago); Student, University of Chicago, 5801 S. 
Ellis Ave., Chicago, Ill.; 5517 S. Everett Ave., Chicago 37, Ill. 

Denzell, George E., B.S. Physics (University of Washington); Teaching Assistant, Math 
Department, University of Washington; 3968 Union By Circle, Seattle 5, Washington. 

Eaton, John Henry, B.S. (University of Alabama); Student, New York University, Washington
Square, N. Y.; 155 Ridge St., New York 2, N. Y.

Eaton, Morris L., Undergraduate Student, University of Washington; 133 Collins Road, 
Kelso, Washington. 

Eisenberg, Herbert B., M.S. (Iowa State University); Member, Mathematics Research 
Staff, System Development Corp., 2500 Colorado Ave., Santa Monica, Calif. 

Flehinger, Betty J., Ph.D. (Columbia); Research Mathematician, International Business 
Machines Corp., Watson Laboratory, 612 West 116 St., New York 27, N. Y. 

Fox, Bennett L., A.B. (University of Michigan); Graduate Student, University of Chicago; 
7823 S. Luella, Chicago 49, Ill. 

Fridshal, Donald, M.S. (New York University); Graduate Assistant, Department of Industrial
Engineering and Operations Research, College of Engineering, New York
University, University Heights, Bronx, N. Y.; 67-32 Juno St., Forest Hills 75, N.Y.

Friedman, Robert, B.A. (Hunter College); Student, New York University, Graduate 
School of Arts and Sciences, New York, N. Y.; 147-26 68 Avenue, Flushing 67, N. Y. 

Ghosh, Sakti Pada, M.Sc. (Calcutta University); Student, Department of Statistics, Uni- 
versity of California, Berkeley, California. 

Grandillo, Anthony D., M.A. Arts and Sciences (Western Reserve University) ; Staff Statis- 
tician, Industrial Rayon Corp., West 98th and Walford, Cleveland, Ohio; 899 Clarence 
Road, Cleveland Hts. 21, Ohio. 

Hall, Robert P., B.S. Mathematics (St. John’s University); Student, New York University, 
School of Arts and Science, 5 Washington Square North, New York; 115-02 Sutter Ave., 
Ozone Park 20, N. Y. 

Hammood, Abdul H. Yahya, B.Sc. Mathematics (Baghdad University); Student, Univer- 
sity of North Carolina, Department of Statistics, Chapel Hill, N.C. 

Heddinger, Richard W., Student, University of Pennsylvania; 231 Prospect Ave., Clifton 
Heights, Penna. 

Hedetniemi, Charles J., Jr., A.B. (University of Michigan); Pvt. E-1, U.S. Army; 307 
Chestnut Street, Falls Church, Virginia. 

Heinzelmann, Waldemar Emil, B.S. (Roanoke College); Graduate Student, Department of 
Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia. 

Henry, Neil W., M.A. (Dartmouth College); Student, Columbia University; Bureau of 
Applied Social Research, 605 W. 115 St., New York 25, N.Y. 

Hickman, James C., Ph.D. (University of Iowa); Instructor, Dept. of Mathematics, State 
University of Iowa, Iowa City, Iowa.

Hinkelmann, Klaus, Diplom-Mathematiker (Universität Hamburg); Graduate Assistant,
Dept. of Statistics, Iowa State University, Ames, Iowa. 

Holgate, Philip, B.Sc. (University of London); Statistician, Rothamsted Experimental 
Station, Harpenden, Herts, England; “High Canons”, Well End, Barnet, Herts, England.

Hucke, Dorothy M., M.S. (Columbia University); Statistician, Biological Control Laboratories,
Charles Pfizer and Co. Inc., 630 Flushing Ave., Brooklyn 5, New York; 82-44
63rd Ave., Rego Park 79, New York.

Hurd, Cuthbert C., Ph.D. Mathematics (University of Illinois); Director, Control Systems,
International Business Machines Corp., 590 Madison Ave., New York, N. Y.; Monterey 
and Cottle Roads, San Jose, California. 

Kailath, Thomas, S.M. (M.I.T.); Research and Graduate Student, Room 26-344, M.I.T.,
Cambridge, 29, Mass. 

Keilin, Joseph E., B.A. (George Washington University); Associate Mathematician, Johns 
Hopkins Applied Physics Laboratory, Johns Hopkins Road, Howard County, Md. 

Keilson, Julian, Ph.D., (Harvard); Staff Consultant, Applied Research Laboratory, Sylvania 
Electronic Systems Division, Waltham, Mass. 

Khatri, Chinubhai Ghelabhai, Ph.D. Statistics (University of Baroda); Lecturer in Sta- 
tistics, Faculty of Science, M.S. University of Baroda (India); Haribhakti Building, 
Salatwada, Baroda, India. 

Kim, Dong Hyun, B.S. Mathematics (Washburn University of Topeka) ; Graduate Research 
Assistant, Department of Statistics, Calvin Hall, Kansas State University, Manhattan, 
Kansas. 

Kinias, Plato D., B.A. Mathematics (Ripon College); Student, Department of Mathematics,
University of Illinois; 507 Bash St., Champaign, Ill.

Koo, Delia Wei, Ph.D. (Radcliffe College); Graduate Assistant, Department of Statistics, 
Michigan State University; 4473 Maumee Drive, Okemos, Michigan. 

Kraft, Gerald, M.A. (Harvard University); Senior Associate, United Research Inc., 808 
Memorial Drive, Cambridge 39, Mass. 
Kumar, Satindar, M.A. (Punjab University); Research Assistant, Department of Statistics,
University of California, Berkeley 4, Calif. 

Kutner, Michael H., B.S. (Central Connecticut State College); Graduate Assistant, Vir- 
ginia Polytechnic Institute; Dept. of Statistics, V. P. I., Blacksburg, Va. 

Maitra, Ashok Prasad, M.Sc. (University of Bombay); Research Teaching Assistant, Department
of Statistics, 501 Campbell Hall, University of California, Berkeley 4, Calif.

Mazuy, Kay Knight, A.B. (Univ. of North Carolina); Jr. Statistician, Research Triangle
Institute, Box 490, Durham, N. C.

Mehra, Krishen L., M.A. (Punjab University) ; Research Assistant, Department of Statistics, 
University of California, Berkeley 4, Calif.; (note: on leave from the position of Lecturer, 
Punjab University, Chandigarh, India). 

Moore, Felix E., B.A. Mathematics (University of Washington); Professor and Chairman, 
Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, 
Michigan. 

Murphy, Robert E., M.S. Mathematics (Texas Technological College); Mathematician, 
El Paso Natural Gas Company, Research and Development, P.O. Box 1492, El Paso, Texas. 

Murty, P. V. R., M.Sc (Andhra University, Waltair, India); Research Asst., Dept. of Sta- 
tistics, University of California, Berkeley 4, California. 

O’Brien, Thomas D., B.E.E. (The City College of N. Y.); Instructor in Electrical Engineer- 
ing, Manhattan College of N. Y., Riverdale, N. Y.; 845 Walton Av., Bronx 61, N. Y. 

Pisani, José Furtado, Licenciado (Universidade de São Paulo); Professor de Estatistica,
Faculdade de Filosofia, Ciencias e Letras de Rio Claro, Departamento de Estatistica,
Caixa Postal, 178, Rio Claro, São Paulo, Brasil.

Pollak, Edward, M.S. (North Carolina State College); Graduate Assistant, Department of 
Mathematical Statistics, Columbia University, New York 27, N. Y.; 1878 Harrison 
Avenue, Bronx 58, N. Y. 

Preston, Lester W., Jr., B.S. (M.I.T.); Research Fellow, Department of Biostatistics, 
School of Public Health, University of North Carolina, Chapel Hill, N. C.; 2933 Clare- 
mont Road, Raleigh, N. C. 

Raghavarao, Damaraju, M.A. (University of Nagpur); Govt. of India Research Training
Fellow, Department of Statistics, University of Bombay, Bombay 1, India. 

Raghunadanan, Kunhunni, M.S. (University of Kerala); Graduate Student, University 
of Wyoming, Laramie, Wyoming; Department of Statistics, University of Wyoming, 
Laramie, Wyoming. 

Ramakumar, R., M.Sc. (Gujarat University, Ahmedabad, India); Student, Department of 
Statistics, Stanford University, Stanford, California. 

Ringer, Larry J., B.S. (Iowa State University); Graduate Assistant, Statistics Department,
Iowa State University, Ames, Iowa. 

Rosenberg, Lloyd H., M.S. (Iowa State University); Research Assistant, Columbia Univer- 
sity, New York, 27, N. Y.; 98-20 62 Drive, Rego Park 74, N. Y. 

Sampford, Michael R., D.Phil (Oxford); Principal Scientific Officer, Agricultural Research 
Council of G. B., Unit of Statistics, University of Aberdeen, Meston Walk, Old Aberdeen, 
Scotland. 

Samuel, Ester, M.A. (Columbia University); Research Assistant, Columbia University, 
New York 27, N. Y.; 290 West End Ave., Apt. 1 D, New York 23, N. Y.

Schwartz, Richard E., B.A. (University of Chicago); Student, Department of Mathematics, 
Cornell University, Ithaca, N. Y.; Mathematician, General Electric Corp., Ithaca, N. Y. 

Seo, Kenzo, M., M.S. (Purdue University); Graduate Student, Department of Mathe- 
matics and Statistics, Purdue University, West Lafayette, Indiana; 27-12 Ross Ade 
Drive, West Lafayette, Indiana. 

Shapiro, Samuel S., M.S. (Columbia University); Graduate Student, Rutgers Univ.; 
212 Phillips Road, New Brunswick, N. J. 
Sheehan, Daniel M., B.S. Mathematics (College of William and Mary); Graduate Assistant,
Virginia Polytechnic Institute; 405 Marlington St., Blacksburg, Va. 

Sherman, Robert E., B.A. (University of Minnesota); Student, University of Minnesota, 
Minneapolis 14, Minn.; 4143 Blaisdell Ave., Minneapolis 9, Minn. 

Shimi, Ismail N., B.Sc. (A’in-Shams University, Cairo, Egypt); Instructor, Department 
of Applied Math., Faculty of Science, A’in-Shams University, Cairo, Egypt, U.A.R.; 
Department of Statistics, University of North Carolina, Chapel Hill, N.C. 

Snyder, Mitchell, A.B. (Yeshiva University); Fellow, Institute of Mathematical Sciences, 
New York University, 25 Waverly Place, N. Y. 3, N. Y.; 2501 Amsterdam Ave., New York
i. = 

Spencer, Gary S., B.S. (Kansas State University); Graduate Research Assistant, Depart- 
ment of Statistics, Kansas State University, Manhattan, Kansas. 

Subrahamaniam, K., M.Sc. (Banaras Hindu University) ; Graduate Student and University 
Fellow, Columbia University; Department of Mathematical Statistics, Columbia Uni- 
versity, New York 27, N. Y. 

Toller, Louis, Ph.D. (Duke University); Head, Dept. of Mathematics and Physics, Alma 
College, Alma, Michigan. 

Usher, William M., M.S. (Oklahoma State University); Operations Research Manager, 
Texas Instruments Incorporated, Semiconductor Division, P.O. Box 5012, Dallas 22, 
Texas; 332 Polk St., Apt. D., Richardson, Texas. 

Wolock, Fred W., M.S. (Catholic University); Graduate Assistant, Department of Statis- 
tics, Michigan State University, East Lansing, Mich.; 18 Lake Terrace, Whitinsville, 
Mass. 

Yahav, Aharon Joseph, M.Sc. (Stanford University); Student, University of California; 
Department of Statistics, University of California, Berkeley 4, California. 




DOCTORAL DISSERTATIONS IN STATISTICS, 1960 


Listed below are doctorates conferred during the year 1960 in the United
States for which the dissertations were written on topics in statistics or related
fields. The university, major subject, and the title of the dissertation are given
in each case. Readers are invited to notify the Editor of any omissions from the
list.


Sidney Addelman, Iowa State University, major in statistics, “Fractional Factorial
Plans.”

Anita K. Bahn, Johns Hopkins University, major in biostatistics, “Psychiatric Clinic
Outpatients.”

Stuart A. Bessler, Stanford University, major in statistics, “Theory and Applications of
the Sequential Design of Experiments, k-Actions and Infinitely Many Experiments.”

John Joseph Birch, University of California, Berkeley, major in statistics, “Approximations
for the Entropy for Functions of Markov Chains.”

Neeti Ranjan Bohidar, Iowa State University, major in statistics, “Role of Sexlinked
Genes in Quantitative Inheritance.”

Roshan L. Chaddha, Virginia Polytechnic Institute, major in statistics, “Some Problems
in Inventory Control.”

Shrishti Dhar Chatterji, Michigan State University, major in statistics, “Martingales of
Banach-Valued Random Variables.”

Charles Christenson, Harvard University, major in business administration, “Strategic
aspects of competitive bidding for corporate debt security.”
Robert Francis Cogburn, University of California, Berkeley, major in statistics, “Asymptotic
Properties of Stationary Sequences.”

Theodore Colton, Johns Hopkins University, major in biostatistics, “Optimum Multi-Stage
Screening Plans.”

Martin Robert Dorff, Iowa State University, major in statistics, “Large and Small Sample
Properties of Estimators for a Linear Functional Relation.”

Satya Deva Dubey, Michigan State University, major in statistics, “Contributions to
Statistical Theory of Life Testing and Reliability.”

James R. Duffett, Virginia Polytechnic Institute, major in statistics, “System Reliability
from Component Reliabilities.”

Bob E. Ellison, University of Chicago, major in statistics, “A Multivariate k-Population
Classification Problem.”

Raymond I. Fields, Virginia Polytechnic Institute, major in statistics, “Estimation with
Samples Drawn from Different but Parametrically Related Distributions.”

Myron B. Fiering, Harvard University, major in engineering, “Statistical Analysis of
Stream Flow Data.”

Franklin Marvin Fisher, Harvard University, major in economics, “A Priori Information
and Time Series Analysis.”

David Freedman, Princeton University, major in statistics, “Mixtures of Stochastic
Processes.”

Marshall Leonard Freimer, Harvard University, major in mathematics, “Truncated Policies
in Dynamic Programming.”

David William Gaylor, North Carolina State College, major in experimental statistics,
“The Construction and Evaluation of Some Designs for the Estimation of Parameters
in Random Models.”

(Mrs.) Ruth Z. Gold, Columbia University, major in mathematical statistics, “Inference
about Markov chains with non-stationary transition probabilities.”

C. Jackson Grayson, Jr., Harvard University, major in business, “Decisions under Uncertainty:
Drilling Decisions by Oil and Gas Operators.”

James Ennis Grizzle, North Carolina State College, major in experimental statistics,
“Application of the Logistic Model to Analyzing Categorical Data.”

William Leroy Hafley, North Carolina State College, major in experimental statistics,
“Some Comparisons of Sensitivities for Two Methods of Measurement.”

William Leonard Harkness, Michigan State University, major in statistics, “An Investigation
of the Power Function for the Test of Independence in 2 × 2 Contingency Tables.”

James E. Jackson, Virginia Polytechnic Institute, major in statistics, “Multivariate
Sequential Procedures for Testing Means.”

Shriniwas Keshavarao Katti, Iowa State University, major in statistics, “Some Aspects of
Statistical Inference for Contagious Distributions.”

Jerome Hamilton Klotz, University of California, Berkeley, major in statistics, “Nonparametric
Tests for Scale.”

Samuel Kotz, Cornell University, major in mathematics, “Exponential Bounds for the
Probability of Error in Discrete Memoryless Channels.”

Joseph H. Kullback, Stanford University, major in statistics, “A Quality Control Model
for Complex Items.”

Harold Joseph Larson, Iowa State University, major in statistics, “Sequential Model
Building for Prediction in Regression Analysis.”

Paul R. Lohnes, Harvard University, major in education, “A Comparison of Test Space
and Discriminant Space Classification Models.”

Theodore K. Matthes, Columbia University, major in mathematical statistics, “Two-stage
Sampling Procedures.”

Edmund B. McCue, Carnegie Institute of Technology, major in mathematics, “Power
Characteristics of the Control Chart for Number of Defects, No Standard Given.”
Donald F. Morrison, Virginia Polytechnic Institute, major in statistics, “Life Distribution
and Reliability of a System with Spare Components.”

Burton Torres Onate, Iowa State University, major in statistics, “Development of Multistage
Designs for Statistical Surveys in the Philippines.”

S. Parameswaran, University of Illinois, major in analysis, “Some Theorems on the Growth
of Partition Functions.”

E. M. Paul, University of Illinois, major in analysis, “Density in the Light of Probability
Theory.”

Edward B. Perrin, Stanford University, major in statistics, “Estimation of Parameters in
Systems Related to the Observation by an Unknown Monotone Transformation.”

William E. Pruitt, Stanford University, major in statistics, “Bilateral Birth and Death
Processes.”

Dana Edward Anthony Quade, University of North Carolina, major in statistics, “The
Asymptotic Power of the Kolmogorov Tests of Goodness of Fit.”

Wyman Richardson, University of North Carolina, major in statistics, “Asymptotic
Methods of Evaluating ∫ f(x) dx.”

Harry Myer Rosenblatt, George Washington University, major in statistics, “Multivariate
Experimental Designs.”

Alan Ross, Iowa State University, major in statistics, “On Two Problems in Sampling
Theory: Unbiased Ratio Estimators and Variance Estimates in Optimum Sampling
Designs.”

Arthur Schleifer, Harvard University, major in business administration, “Use of Bayes’
Decision Theory in Quality Control.”

Lorraine Schwartz, University of California, Berkeley, major in statistics, “Consistency
of Bayes’ Procedures.”

Edney Webb Stacy, University of North Carolina, major in statistics, “An Estimate of
Correlation Corrected for Attenuation and its Distribution.”

(Mrs.) Charlotte T. Striebel, University of California, Berkeley, major in statistics, “Efficient
Estimation of Regression Parameters for Certain Second Order Stationary
Processes.”

Shashikala B. Sukhatme, Michigan State University, major in statistics, “Asymptotic
Theory of Some Nonparametric Tests.”

M. Sudigdomarto, University of Illinois, major in analysis, “A Representation Theory for
the Laplace Transform of Vector-Valued Functions.”

Thomas Neil Throckmorton, Iowa State University, major in statistics, “Structures of
Classification Data.”

Leo J. Tick, Columbia University, major in mathematical statistics, “Contributions to the
Theory and Applications of Stationary Random Processes in Fluid Mechanics.”

M. T. Wasan, University of Illinois, major in statistics, “Sequential Inference.”

John Thomas Webster, North Carolina State College, major in experimental statistics,
“A Decision Procedure for the Inclusion of an Independent Variate in a Linear Estimator.”

Thomas A. Willke, Ohio State University, major in statistics, “A Class of Multivariate
Rank Statistics.”

David M. G. Wishart, Princeton University, major in statistics, “Augmentation Techniques
in the Theory of Queues.”

William Max Woods, Stanford University, major in statistics, “Variables Sampling Inspection
Procedures which Guarantee Acceptance of Perfectly Screened Lots.”

Joe Curtis Woosley, University of Michigan, major in public health statistics, “A Study of
Repeated Hospital Admissions among Michigan Blue Cross Members.”

Nils Donald Ylvisaker, Stanford University, major in statistics, “On Time Series Analysis
and Reproducing Kernel Spaces.”
Note: Reports of the President and the Editor for 1961 will appear in the December issue.

REPORT OF THE SECRETARY FOR 1961


During the year the Institute has held its 86th and 87th meetings at Cornell 
University in Ithaca, N. Y. and at the University of Washington, in Seattle, 
Washington, respectively. A Business Meeting was held during the 87th (24th 
Annual) meeting to make the necessary decisions for carrying on the work of 
the Institute for the coming year. 

The Institute is grateful for the efforts of Mrs. Dorothy M. Gilford, Program
Coordinator. Her help and guidance in this capacity have been invaluable in
setting up meaningful programs for the 1961 meetings. She was ably assisted
by the local Program Chairmen, Robert Bechhofer and David L. Wallace, and
their committees. Joan Rosenblatt and Fred C. Andrews performed their duties
as Associate Secretaries meticulously and competently. The Assistant Secretaries,
Isadore Blumen and Robert Tate, should be mentioned for their part in
making satisfactory arrangements for their meetings.

G. E. Nicholson, Jr.




REPORT OF THE TREASURER FOR 1961 


This is my first report as Treasurer. Unfortunately the Annual Meeting this 
year comes before we are even half-way through the fiscal year so that we have 
little experience with the new financial policies upon which to forecast the 
financial outcome for 1961. Last year’s activities have been certified by our 
auditors—Stebbins, Dillon, Curry and Harrington—and the Balance Sheet of 
December 31, 1960 is included in this report. 

The Revenue and Expenditure statement for the year ending December 31,
1960 is attached, and I have also included statements of 1958 and 1959 for comparative
purposes. Dues revenue, subscription revenue and income from our
investments all continued the same gentle rise. The sharp drop in net income
from the sale of back issues is a continuation of the trend resulting from the later
issues being sold practically at cost.

The large increase in expense for 1960 occurred in the catchall category of
miscellaneous printing, etc., in the editorial expense and in travel. The miscellaneous
category consists chiefly of expenses incurred by the Secretary’s Office
in communications with the membership. The miscellaneous office expense
was also larger than usual because of audit expense. In the past the Institute
had been employing the services of a faculty colleague to conduct a minimum
audit; but it is now felt worthwhile to have a more thorough audit performed
by an outside firm. The audit also covered an eighteen-month period from
January 1, 1959 to June 30, 1960. The editorial expense for 1960 is inflated
because we follow a cash system of accounting and part of the 1959 expenses were
not reimbursed until early in 1960. The travel expense was higher than usual
because it was necessary for two of the officers to have their expenses paid to the
Annual Meeting, and the past Treasurer made a trip in connection with the
financial innovations.
The 1961 budget suggested here varies only to a small degree from the one
presented at the last Annual Meeting. It is too early in the year to make refined 
predictions on revenue from membership and subscription. Even though the 
subscription rate was raised, cancellations were not unusually heavy. Based 
on last year’s experience with the sale of back issues the estimate of revenue 
from this source was lowered by $1,000. 

As this is written, we are receiving the first returns from the inauguration of 
page charges in the Annals. The page charges are $15 per page, which is less 
than half the cost. Dr. Bowker, in his last Treasurer’s Report, estimated that 
the Institute might realize $7,000 in revenue the first year of page charges. In 
the absence of any evidence to the contrary, 75% of this amount is being 
projected since we will bill for only the first three numbers of the current volume 
(billing for the December 1961 issue of the Annals will take place in 1962). 
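
The projection in the preceding paragraph can be written out as a short calculation. This is only a sketch: it assumes that the 75% figure reflects billing for three of the four numbers in a volume, and the symbol names are introduced here purely for illustration.

```latex
\[
R_{\text{projected}} = 0.75 \times R_{\text{estimated}}
                     = 0.75 \times \$7{,}000
                     = \$5{,}250 .
\]
```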

In a reappraisal of expenses for 1961 the original estimate for the Annals 
has been left standing even though the first issue this year cost over $9,000. 
The estimates for salaries and editorial expenses have been raised slightly; the 
latter increase was to compensate for an additional $400 (over the approved $4,800) 
incurred in 1960. Dr. Kruskal, the outgoing editor, has proposed an overlap of 
editorial assistance with the incoming editor, Dr. Hodges, but my budget makes 
no provision for this pending Council decision. 

We may reverse the indicated trend this year and come out in the black—or 
almost so. In 1961 we have raised subscription rates and the charge for the later 
back issues, and have instituted page charges. It is too early in the year to forecast 
the results, but I shall submit a revised budget in the Fall. 

GERALD J. LIEBERMAN 


INSTITUTE OF MATHEMATICAL STATISTICS 
Balance Sheet 
December 31, 1960 
Assets 
Current Assets: 


Cash in bank—checking account 
Cash in banks—savings accounts 
Investments: 
U.S. Gov’t bonds—at cost $8,857.25 
Savings certificate 5,000.00 
Total investments 
Accounts receivable: 
Dues 
Subscriptions 
Back issues of Annals 
Total receivables 
Inventory of Annals 


Total assets $129,066.50 


Liabilities and Surplus 
Liabilities 


Accounts payable $11,780.13 
Payroll taxes payable 179.02 
Dues advanced by members 3,679.00 
Advanced subscriptions , 000. 50 
Wald royalties payable 96.40 
Grant from National Science Foundation to 

subsidize operations—Note 1 , 058. 
Grant from National Science Foundation to 

subsidize publication of Index—Note 2 2 ,536.¢ 

Total liabilities 


Surplus 


Reserve for life members 
Available for maintaining supply of Annals 
issued 
Available for general purposes 
Total surplus 76,737. 


Total liabilities and surplus $129,066.50 


INSTITUTE OF MATHEMATICAL STATISTICS 
Statement of Changes in Surplus 
For the Year Ended December 31, 1960 


Surplus, January 1, 1960 
Additions: 

Grant received from National Science 
Foundation to subsidize adjusted oper- 
ating loss for the year ended December 
31, 1960 

Total 
Reductions: 

Operating loss for the year ended December 
31, 1960 $8,390.81 

Provision for cost of reprinting Annals issued 
in 1960 


Total reductions 11,117.81 


Surplus, December 31, 1960 $38,393.98 

Note 1. The National Science Foundation advanced $18,000.00 to subsidize publications 
and to investigate cost reduction for the three-year period beginning 1958. Operating losses 
for the years 1958, 1959 and 1960 totaled $13,955.55. The provision for the cost of reprinting 
Annals for the same period totaled $8,199.00. 

Note 2. In 1960 the National Science Foundation granted the Institute $22,550.00 to 
cover the cost of compiling an index for the Annals volumes one through thirty. Under the 
terms of the grant, all receipts from sales of indexes during the three year period beginning 
with the date of publication, in excess of $6,000.00 must be returned to the Foundation 
until the advance has been repaid. 


REPORT OF THE ITHACA, NEW YORK MEETING OF THE INSTITUTE 
OF MATHEMATICAL STATISTICS 


The eighty-sixth meeting of the Institute of Mathematical Statistics was 
held at Cornell University on April 20-22, 1961, in conjunction with the Biometric 
Society (ENAR), and the Biometric Section, the Section on Physical and Engi- 
neering Sciences, and the Section on Social Statistics of the American Statistical 
Association. 

These meetings were held at this time to honor Walter F. Willcox on his 
hundredth birthday. Professor Emeritus Willcox is a past Chief Statistician of 
the Bureau of the Census, a past President of the American Statistical Associ- 
ation, and is an Honorary President of the International Statistical Institute. 
A special session in recognition of his interests, arranged by the Social Statistics 
Section of the ASA, was held Friday evening, April 21. A reception and party 
in honor of Professor Willcox were held on Thursday evening, April 20. 

By invitation of the Committee on Special Invited Papers, the Institute was 
addressed by Professor D. Dugué of the Sorbonne and Catholic University, on 
“Random Integration: Results and Problems,” Thursday, April 20, at 1:20 
p.m. 

There were 260 persons present at the meetings, including 167 members of the 
Institute of Mathematical Statistics. Some sessions were held under joint 
sponsorship with other societies; where this is indicated, the session was arranged 
by the society listed first. The program of the meeting was as follows: 


THURSDAY, April 20, 1961 
9:00-10:30 a.m.—Limit Theorems 


Chairman: Jack Kiefer, Cornell University 
1. “Limit Theorems,” Harald Bergström, Chalmers University of Technology, Göteborg, 
and Catholic University. 
2. “Properties of the Ergodic Limit Function,” Rafael V. Chacon, Cornell University. 
3. “Limit Theorems of Stochastic Processes,” Gisiro Maruyama, Kyusyu University 
and Columbia University. 


9:00-10:30 a.m.—Contributed Papers I 


Chairman: Acheson J. Duncan, The Johns Hopkins University 
1. “On the Higher Moments of Linear Estimates Based on Multistage Samples from a Finite 
Population,” J. C. Koop, North Carolina State College. 
2. “The Moments of the Non-Central t-Distribution,” D. Hogben, R. S. Pinkham, and 
M. B. Wilk, Rutgers, The State University. 
3. “Tables of Minimum Functions for Generating Galois Fields GF(p^n),” J. D. Alanen, 
Case Institute of Technology (Introduced by I. M. Chakravarti). 
4. “On Simultaneous Tests in Nested Designs” (Preliminary Report), P. R. Krishnaiah, 
Remington Rand UNIVAC. 
5. “Some Notes on the Investigation of Heterogeneity in Interactions,” N. L. Johnson, 
Case Institute of Technology. 
6. “Fractional Factorial 2^n and 3^n Designs With and Without Blocks, Preserving the Main 
Effects and the Two Factor Interactions,” M. S. Patel, University of North Carolina 
and Research Triangle Institute. 


10:35-12:00 noon—Sampling (BS and IMS) 


Chairman: Philip J. McCarthy, Cornell University 
1. “Analytic Studies of Sample Surveys,” H. O. Hartley, Iowa State University. 
2. “Sampling with Unequal Probabilities,” J. N. K. Rao, Iowa State University. 
3. “Measurement Errors in Censuses and Surveys,” Morris H. Hansen, William N. 
Hurwitz, and Max A. Bershad, Bureau of the Census. (Read by Leon Pritzker, 
Bureau of the Census) 


11:00-12:00 noon—Time Series 


Chairman: Jerome Sacks, Cornell University 
1. “Regression in Stationary Time Series,” Madhav P. Heble, Columbia University. 
2. “Series Representations of Gaussian Processes,” N. Donald Ylvisaker, New York 
University. 


1:20-2:10 p.m.—Special Invited Paper 
Chairman: Leo Katz, Michigan State University 
“Random Integration: Results and Problems,” Daniel Dugué, The Sorbonne and 
Catholic University. 


2:20-4:00 p.m.—Ranking and Selection Procedures I (IMS and BS) 


Chairman: Robert Bechhofer, Cornell University 
1. “Selecting the Largest of k Normal Population Means,” Charles W. Dunnett, Lederle 
Laboratories. 
2. “A Sequential Procedure for Comparing Several Experimental Categories with a 
Control,” Edward Paulson, Cornell University. 
3. “Bayes Selection Procedures with Linear Loss Structure,” Howard Raiffa, Harvard 
University. 


2:20-4:00 p.m.—Contributed Papers II 


Chairman: Max Halperin, General Electric—Schenectady 
1. “Use of a Priori Knowledge in the Estimation of Means from Double Samples,” S. K. 
Katti, Florida State University. 
2. “A Posteriori Distributions in the Translation Parameter Case” (Preliminary Report), 
Martin Fox and Herman Rubin, Michigan State University. 
3. “Some Theory and Techniques for Robust Estimation” (Preliminary Report), Allan 
Birnbaum, New York University. 
4. “Central Limit Theorem for Sums over Sets of Random Variables,” Friedhelm Eicker, 
University of North Carolina. 
5. “On the Limiting Distribution of −2 log λ in the Non-regular Case,” Donald A. Jones, 
University of Michigan. 
6. “A Double-Ended Queuing Process,” Samuel M. Giveen, Northeastern University 
(Introduced by Lionel Weiss). 


4:10-5:50 p.m.—Ranking and Selection Procedures II (IMS and BS) 


Chairman: Milton Sobel, University of Minnesota 
1. “Bayes Rules for the Problem of Choosing the Largest Mean,” Richard P. Bland, 
University of North Carolina, and David B. Duncan, Johns Hopkins University. 
2. “On Selecting a Subset Containing the Population with the Smallest Variance,” Shanti 
S. Gupta, Bell Telephone Laboratories, Allentown, Pa. 
3. “Some Sequential Tests for Symmetric Problems,” Wm. Jackson Hall, University of 
North Carolina. 







FRIDAY, April 21, 1961 
8:15-10:00 a.m.—Probability (General) 


Chairman: Douglas Robson, Cornell University 
1. “Finite Cascades,” Peter Ney, Cornell University. 
2. “Properties of a Transient Queue,” Warren M. Hirsch, New York University. 
3. “Amalgamation of Ordered Weighted Means,” Colin L. Mallows, Columbia 
University. 
4. “Absorption Time in a Markov Chain,” Geoffrey A. Watterson, Australian 
National University and Virginia Polytechnic Institute. 


9:00-10:00 a.m.—Sequential Procedures in Medicine (BS and IMS) 


Chairman: S. Geisser, National Institutes of Health 
1. “Sequential Allocation of Patients in Clinical Trials,” Robert J. Taylor and 
Herbert A. David, Virginia Polytechnic Institute. 
2. “An Infinity of Closed Sequential Plans,” Marvin A. Schneiderman, National 
Institutes of Health. 


10:30-12:00 noon—Probability (General) 


Chairman: Louis J. Cote, Purdue University 
1. “Some Remarks on the Renewal Theorem,” V. E. Benes, Bell Telephone Laboratories, 
Murray Hill, N. J. 
2. “The Renewal Density Theorem,” Walter L. Smith, University of North Carolina. 
3. “The Two-Sided Absorption Problem for Stable Processes,” Harold Widom, Cornell 
University. 


10:30-12:00 noon—Contributed Papers III 


Chairman: Ingram Olkin, University of Minnesota 
1. “Null Distribution and Bahadur Efficiency of the Hodges Bivariate Sign Test” 
(Preliminary Report), A. Joffe and Jerome Klotz, McGill University. 
2. “Power of Mood’s and Massey’s Tests against Exponential and Rectangular Alternatives,” 
F. C. Leone, I. M. Chakravarti, and J. D. Alanen, Case Institute of Technology. 
3. “Asymptotic Relative Efficiency of Mood’s and Massey’s Tests Against Some Parametric 
Alternatives,” I. M. Chakravarti, F. C. Leone, and J. D. Alanen, Case Institute 
of Technology. 
4. “Power of a Non-Parametric Test of Independence,” Regina C. Elandt, Case 
Institute of Technology (Introduced by N. L. Johnson). 
5. “Selection of the Best Treatment in a Paired-Comparison Experiment,” B. J. Trawinski 
and H. A. David, Virginia Polytechnic Institute. 
6. “A Multivariate Analogue of One-Sided Test” (Preliminary Report), Akio Kudo, 
University of Michigan. 
7. “On Some Properties of Compositions of an Integer and Their Application to Probability 
Theory,” T. V. Narayana and S. G. Mohanty, University of Alberta. 


1:20-2:10 p.m.—Invited Paper 


Chairman: John Tukey, Princeton University 
“Some Thoughts on Statistical Inference,” Egon S. Pearson, University College, 
London, and Princeton University. 


2:20-4:10 p.m.—Statistics (General) (IMS and BS) 


Chairman: Isadore Blumen, Cornell University 
1. “Power of the Likelihood-Ratio Test of the General Linear Hypothesis in Multivariate 
Analysis,” Harry O. Posten, IBM Research Center, and Rolf Bargmann, 
Virginia Polytechnic Institute. 
2. “Asymptotic Means and Variances in k Dimensions,” David R. Brillinger, Princeton 
University. 
3. “Inference on Treatment Effects and Design of Experiments in Relation to Such 
Inference,” S. N. Roy and J. N. Srivastava, University of North Carolina. 
4. “A Sequential Test of Fit,” Lionel Weiss, Cornell University. 


2:20-4:00 p.m.—Contributed Papers IV 


Chairman: Walter T. Federer, Cornell University 
1. “A Generalization of a Simple Test Function for Guarantee Time Associated with the 
Exponential Failure Law,” Satya D. Dubey, Procter and Gamble Co. 
2. “Probability Plots for the Gamma Distribution,” M. B. Wilk, R. Gnanadesikan, 
and M. J. Huyett, Bell Telephone Laboratories, Murray Hill, N. J. 
3. “The Use of Sample Ranges in Setting Exact Confidence Bounds for the Standard 
Deviation of a Rectangular Population,” H. Leon Harter, Wright-Patterson Air Force 
Base. 
4. “Iterated Steepest Ascent on Ellipsoidal Contours,” R. J. Buehler, B. V. Shah, and 
O. Kempthorne, Iowa State University. 
5. “The Method of Parallel Tangents (PARTAN) for Finding an Optimum,” B. V. Shah, 
R. J. Buehler, and O. Kempthorne, Iowa State University. 
6. “Steepest Ascent PARTAN on Ellipsoidal Contours,” R. J. Buehler, B. V. Shah, and 
O. Kempthorne, Iowa State University. 


4:20-5:40 p.m.—Statistics (General) (IMS and BS) 


Chairman: Leon Herbach, New York University 
1. “Foundations of Statistics,” Allan Birnbaum, New York University. 
2. “Admissible and Inadmissible Minimax Estimates,” Roger H. Farrell, Cornell 
University. 
3. “Bayesian Robustness,” Mervyn Stone, Princeton University. 


4:20-5:40 p.m.—Bioassay (BS and IMS) 


Chairman: Irwin D. J. Bross, Roswell Park Memorial Institute 
1. “Asymptotic Power of Tests of Linear Hypotheses Using the Probit and Logistic 
Models,” James E. Grizzle, University of North Carolina. 
2. “Some Bioassay Techniques for the Determination of Minute Residues,” John 
Gurland, University of Wisconsin. 
3. “Some Characteristics of the Spearman-Karber Estimator in Bioassay,” Byron W. 
Brown, University of Minnesota. 


8:00 p.m.—The First 60 Years of the Census Bureau (ASA, IMS, and BS) 


Chairman: Robert W. Burgess, Director, Bureau of the Census 


SESSION IN HONOR OF WALTER F. WILLCOX 


1. “Impact of Research and Development on Census Methods in the 20th Century,” Morris 
H. Hansen, William N. Hurwitz, and Joseph F. Daly, Bureau of the Census. 
2. “Developments in the Analysis and Use of Census Data: 1900-1960,” Conrad Taeuber, 
Bureau of the Census. Remarks: Walter F. Willcox, Cornell University. 


SATURDAY, April 22, 1961 
9:00-9:50 a.m.—Invited Paper 


Chairman: Rosedith Sitgreaves, Columbia University 
“Optimal Experimental Designs,” Jack Kiefer, Cornell University. 


10:00-10:50 a.m.—Reliability (IMS and BS) 


Chairman: Clyde Kramer, Virginia Polytechnic Institute 
1. “Testing to Establish a High Degree of Safety or Reliability,” Francis J. Anscombe, 
Princeton University and Bell Telephone Laboratories. 
2. “Improving Mean Life by Eliminating Items with Short Lives,” William T. Wells and 
Geoffrey S. Watson, University of Toronto and Research Triangle Institute. 


10:00-10:50 a.m.—Information Theory (IMS and BS) 


Chairman: Esther Seiden, Michigan State University 
1. “On the Construction of Bose-Chaudhuri Matrices with the Help of Abelian Group 
Characters,” Dominique C. Foata, University of North Carolina. 
2. “Algorithms for Zero-Error Optimum Codes,” D. K. Ray-Chaudhuri, University of 
North Carolina. 


11:00-12:00 noon—Group Screening (IMS and BS) 


Chairman: Robert J. Hader, University of North Carolina 
1. “Group Testing to Eliminate Efficiently all Defectives in a Binomial Sample,” Milton 
Sobel, University of Minnesota. 
2. “Group Screening Designs,” Geoffrey S. Watson, University of Toronto and 
Research Triangle Institute. 


Contributed Papers Presented by Title 


1. “Extreme Values in Gaussian Sequences,” Simeon Berman, Columbia University. 
2. “On the Foundations of Statistical Inference, II” (Preliminary Report), Allan 
Birnbaum, New York University. 
3. “On the Foundations of Statistical Inference, III,” Allan Birnbaum, New York 
University. 
4. “Several-sided Kolmogoroff-Smirnoff Procedures,” Herbert T. David, Iowa State 
University. 
5. “A Simple Test Function for Guarantee Time Associated with the Exponential Failure 
Law” (Preliminary Report), Satya D. Dubey, Procter and Gamble Co. 
6. “Asymptotically Most Efficient Single Observation Estimator of Expected Life for 
Exponential Failure Law” (Preliminary Report), Satya D. Dubey, Procter and 
Gamble Co. 
7. “Location and Scale Parameters in Exponential Families of Distributions” 
(Preliminary Report), T. S. Ferguson, University of California, Los Angeles. 
8. “Nonparametric Methods for Additive Effects,” J. L. Hodges, Jr. and E. L. Lehmann, 
University of California, Berkeley. 
9. “On the Expected Value and Variance of a Ratio Estimate,” J. C. Koop, North Carolina 
State College. 
10. “A Bayes Surveillance Procedure,” John E. Nylander, Boeing Airplane Company. 
11. “On the Performance of the Group-Screening Method,” M. S. Patel, Research Triangle 
Institute. 
12. “On the Distribution of First Significant Digits,” Roger S. Pinkham, Rutgers—The 
State University. 
13. “On the Theory of Univariate Successive Sampling,” S. G. Prabhu Ajgaonkar and 
B. D. Tikkiwal, Karnatak University, Dharwar, India. 
14. “Asymptotic Relative Efficiency of Mood’s Test for Two-Way Classification,” Y. S. 
Sathe, University of North Carolina. 
15. “Estimation of Parameters of the Gamma Distribution Using Order Statistics,” M. B. 
Wilk, R. Gnanadesikan, and M. J. Huyett, Bell Telephone Laboratories, 
Murray Hill, N. J. 


REPORT OF THE SEATTLE, WASHINGTON MEETING OF THE 
INSTITUTE OF MATHEMATICAL STATISTICS 


The eighty-seventh meeting of the Institute of Mathematical Statistics and 
the twenty-fourth annual meeting was held at the University of Washington, 
Seattle, Washington, on June 14-17, in conjunction with the meetings of the 
American Statistical Association (Section on Physical and Engineering Sciences), 
the Biometric Society (WNAR), the Institute of Management Sciences, the 
American Mathematical Society, and the Mathematical Association of America. 

By invitation of the Committee on Special Invited Papers, the Institute was 
addressed by Professor Tore Dalenius of the University of Stockholm and 
Catholic University on “Recent Developments in Sample Survey Theory and Method,” 
and by Professor Marek Fisz of the University of Warsaw and the University 
of Washington on “Infinitely Divisible Distributions: Recent Results and 
Applications.” The Wald Lectures were delivered by Professor Charles Stein of 
Stanford University in three parts under the general title “Estimation of Many 
Parameters”: I, “Estimation of Many Means”; II, “Approximation of Prior Measures 
by Probability Measures”; and III, “Some Remarks on the Foundations of Statistics.” 
The Rietz Lecture was delivered by Professor David Blackwell 
of the University of California on “Dynamic Programming.” 

There were 200 members of the Institute registered for the meeting. The 
program of the meeting was as follows: 


WEDNESDAY, JUNE 14, 1961 
8:30 a.m.—Invited Papers on Problems of Design 


Chairman: D. L. Wallace, University of Chicago and Center for Advanced Study in the 
Behavioral Sciences. 
1. “The Method of Parallel Tangents for Finding an Optimum,” Robert J. Buehler, B. 
V. Shah, and O. Kempthorne, Iowa State University. 
2. “Optimal Accelerated Life Designs for Estimation and Testing,” Herman Chernoff, 
Stanford University. 
3. “Minimal Designs,” Jack Kiefer, Cornell University. 


8:30 a.m.—Invited Papers on Reliability Theory (ASA-SPES and IMS) 


Chairman: Larry Hunter, General Telephone and Electronics Laboratories 
1. “Redundancy Models,” Frank Proschan, Boeing Scientific Research Laboratories. 
2. “Semi-Markov Processes and Their Applications to Multi-State Time-Dependent 
Phenomena,” George Weiss, Institute for Fluid Dynamics and Applied Mathematics, 
University of Maryland. 
3. “Estimating Mean Life from Grouped Data Under the Exponential Assumption,” 
Jack Nadler, Bell Telephone Laboratories. 


10:20 a.m.—Contributed Papers I 


Chairman: Herman Rubin, Michigan State University 
1. “The Role of the Multivariate Edgeworth Series in the Random Walk Problem,” J. F. 
Price, Boeing Airplane Co.; W. M. Stone, Boeing Airplane Co.; ann J. D. Waesr- 
Lock, Oregon State University. 







2. “Determining Bounds on Expected Values of Certain Functions,” Bernard Harris, 
University of Nebraska. 
3. “Evaluation and Design of Multiple Choice Questionnaires” (Preliminary Report), 
H. Chernoff, Stanford University. 
4. “Some Tests for Outliers,” C. P. Quesenberry, Montana State College, and H. A. 
David, Virginia Polytechnic Institute. 
5. “On a Mathematical Model for Poliomyelitis Vaccine Effectiveness” (Preliminary 
Report), Dana Quade, Communicable Disease Center, Atlanta, Georgia. 
6. “A Probability Model for Couple Fertility,” S. N. Singh, University of California. 


10:20 a.m.—Invited Papers on Nonparametric Statistics (ASA-SPES and IMS) 


Chairman: Ronald Pyke, University of Washington 
1. “On the Use of a Distribution-Free Property in Determining a Transformation of One 
Variate so that it Will Exceed Another with a Given Probability,” Sam C. Saunders, 
Mathematics Research Center, U. S. Army, University of Wisconsin and Boeing 
Airplane Co. 
2. “Non-Parametric Methods for the Detection of Signals in Noise,” Jack Capon, Federal 
Scientific Corporation. 
3. “A Comprehensive Coverage of the Non-Parametric Field,” John E. Walsh, System 
Development Corporation. 


2:10 p.m.—Invited Papers on Stochastic Processes 


Chairman: Emanuel Parzen, Stanford University 
1. “Branching Processes and Semigroups of Operators,” A. T. Bharucha-Reid, 
University of Oregon. 
2. “Inference in Stochastic Processes,” Herman Rubin, Michigan State University. 


2:10 p.m.—Invited Papers on Sequential Analysis (ASA-SPES and IMS) 


Chairman: Herman Chernoff, Stanford University 
1. “Uncertainty, Information, and Sequential Experiments,” M. H. DeGroot, Carnegie 
Institute of Technology and University of California, Los Angeles. 
2. “Asymptotic Shapes: Sequential Testing of Composite Hypotheses,” Gideon Schwarz, 
Stanford University. 
3. “Bayes Procedures,” James A. Lechner, Westinghouse Electric Corp. 


4:00 p.m.—1961 Council Meeting 


7:30 p.m.—Invited Papers on Informal Inferential Procedures (IMS and ASA- 
SPES and BS-WNAR) 


Chairman: A. M. Mood, C-E-I-R, Inc., Los Angeles, California 
1. “The Future of Data Analysis,” J. W. Tukey, Princeton University and Bell 
Telephone Laboratories, Murray Hill, N. J. 
2. “Some Sequences of Fractional Replicates,” C. Daniel, N.Y. 
3. “Graphical Methods for Internal Comparisons in Multi-Response Experiments,” M. 
B. Wilk and R. Gnanadesikan, Bell Telephone Laboratories, Murray Hill, N. J. 
Discussants: Allan Birnbaum, New York University 
A. P. Dempster, Harvard University 


THURSDAY, JUNE 15, 1961 
8:30 a.m.—Contributed Papers II 


Chairman: Bob Ellison, Lockheed Aircraft Research Laboratories 
1. “Tables for the Reliability of Repairable Systems with Time Constraints” (Preliminary 
Report), Roquez Bejarano and Ronald S. Dick, International Electric 
Corporation, Paramus, N. J. 
2. “Length of the Longest Run of Consecutive Successes,” E. J. Burr, University of New 
England, Armidale, N.S.W., Australia, and Stanford University. 
3. “Oddities in Estimating the Scale Parameter of the Weibull Distribution,” Eugene H. 
Lehman, Jr., Purdue University. 
4. “On the Jiřina Sequential Tolerance Limits Procedure,” Sam C. Saunders, Army 
Research Center, University of Wisconsin and Boeing Airplane Co. 
5. “Estimation of Failure Rates of Systems in Development,” David Rubinstein, General 
Electric Company, Ithaca, N. Y. 
6. “An Optimal Sequential Accelerated Life Test,” Stuart A. Bessler, General 
Telephone and Electronic Laboratories, Inc., Menlo Park, California; Herman 
Chernoff, Stanford University; and Albert W. Marshall, Stanford University and 
Institute for Defense Analysis, Princeton, N. J. 
7. “An Optimal Sequential Accelerated Life Test with Exponential Dependence on Stress,” 
Gideon Schwarz, Stanford University. 


8:30 a.m.—Invited Papers on Queueing (ASA-SPES and IMS) 


Chairman: Richard Barlow, Institute for Defense Analysis, Princeton, N. J. 
1. “A Generalization of the Ballot Problem and its Application in the Theory of Queues,” 
Lajos Takács, Columbia University. 
2. “Application of Cohen’s Derived Markov Chains,” R. Syski, University of Maryland. 
3. “Queues Subject to Service Interruptions,” Julian Keilson, Sylvania Electric 
Products. 


10:20 a.m.—Invited Papers on Miscellaneous Topics 


Chairman: Paul Sommerville, C-E-I-R, Inc., Los Angeles, California 
1. “Tchebycheff Inequalities in a Generalized Moment-Problem,” C. L. Mallows, Bell 
Telephone Laboratories, Murray Hill, N. J. 
2. “The Measurement of Coherence,” N. R. Goodman, Space Technology Laboratories, 
Inc. 
3. “Classification without Known-Class Data,” David Wallace, University of Chicago 
and Center for Advanced Study in the Behavioral Sciences. 


10:20 a.m.—Invited Papers on Planning and Analysis of Experiments (ASA- 
SPES, IMS and BS-WNAR) 


Chairman: MARVIN ZELEN, University of Maryland and National Bureau of Standards 

1. “The Consideration of Variance and Bias Errors in the Selection of a Response Surface 
Design,” G. E. P. Box, University of Wisconsin, and Norman R. Draper, Army 
Research Center, University of Wisconsin. 
2. “Orthogonal Main-Effect Plans,” Sidney Addelman, Research Triangle Institute. 
3. “Asymmetric Factorial Designs and the Direct Product,” B. Kurkjian, Diamond 
Ordnance Fuze Laboratories, and M. Zelen, University of Maryland and National 
Bureau of Standards. 


1:00 p.m.—Wald Lecture I 


Chairman: Leo Katz, Michigan State University 
“Estimation of Many Parameters, I: Estimation of Many Means,” Charles Stein, 
Stanford University. 


2:10 p.m.—Invited Papers on Stochastic Models 


Chairman: D. A. S. Fraser, University of Toronto 
1. “On a Stochastic Genetics Model of Moran,” Samuel Karlin, Stanford University. 
2. “Stochastic Models for Certain Pattern Recognition Schemes,” G. P. Steck, Sandia 
Corporation. 


2:10 p.m.—Invited Papers on Industrial Statistics (ASA-SPES and IMS) 


Chairman: Gerald J. Lieberman, Columbia University and Stanford University 
1. “Bayes Sequential Rectification Sampling,” M. V. Johns, Jr., Stanford University. 
2. “Bayes Test of ‘p ≤ ½’ versus ‘p > ½’,” Sigeiti Moriguti and Herbert Robbins, 
Columbia University. 
3. “Stationary Sources and Information Rates,” N. Donald Ylvisaker, New York 
University. 


3:40 p.m.—Special Invited Paper 


Chairman: J. L. Hodges, Jr., University of California, Berkeley 
“Recent Developments in Sample Survey Theory and Method,” Tore Dalenius, 
University of Stockholm and Catholic University. 


4:50 p.m.—1961 Business Meeting 


9:00 p.m.—Party with American Mathematical Society, Sandpoint Naval Air 
Station—Officers’ Club 


FRIDAY, JUNE 16, 1961 
8:30 a.m.—Invited Papers on Estimation (IMS and BS-WNAR) 


Chairman: Leroy Folks, Oklahoma State University 
1. “Combining Information in Incomplete Blocks,” Franklin A. Graybill, Colorado 
State University. 
2. “Estimation with Minimum Mean Square Error,” H. O. Hartley, Iowa State 
University. 
3. “Remarks on the Efficiency of Unbiased Estimation with Auxiliary Variates,” W. H. 
Williams, Bell Telephone Laboratories, Murray Hill, N. J. 


8:30 a.m.—Information Theory (ASA-SPES and IMS) 


Chairman: Aram Thomasian, University of California 
1. “Two Output-One Input Zero Memory Channels,” D. Blackwell, University of 
California, Berkeley; L. Breiman, University of California, Los Angeles; and A. 
Thomasian, University of California, Berkeley. 
2. “Random Coding of a Continuous Source,” John L. Kelly, Jr., Bell Telephone 
Laboratories, Murray Hill, N. J. 
3. “The Ergodic Theorem of Information Theory,” Shu-Teh C. Moy, Syracuse 
University. 


11:00 a.m.—Wald Lectures II 


Chairman: A. H. Bowker, Stanford University 
“Estimation of Many Parameters, II: Approximation of Prior Measures by Probability 
Measures,” Charles Stein, Stanford University. 


1:00 p.m.—Institute of Mathematical Statistics—1962 Council Meeting 
1:00 p.m.—Contributed Papers III 


Chairman: B. J. Faruie, Bell Telephone Laboratories, Murray Hill, N. J. 

1. “A Monte Carlo Analysis of the Serial Correlation Coefficient,” John S. White, General 
Motors Research Laboratories, Warren, Michigan. 
2. “Linear Hypotheses with Linear Restrictions,” André G. Laurent, Wayne State 
University, Detroit, Michigan. 
3. “Mutual Information and Maximal Correlation as Measures of Dependence,” C. B. 
Bell, San Diego State College. 
4. “An Application of the Sequential Probability Ratio Test to Finite Populations,” Paul 
Gunther, Armour Research Foundation Technology Center, Chicago, Illinois. 
5. “Subsamples and Order Statistics” (Preliminary Report), J. T. Chu and Kamal 
Ya’coub, University of Pennsylvania. 
6. “Minimum Risk Estimation: A Nonparametric Case Involving Percentiles,” Herbert 
B. Eisenberg and John E. Walsh, System Development Corp., Santa Monica, 
California. 
7. “Estimation of Location and Scale Parameters by Optimally Selected Observations,” 
Carl-Erik Särndal, University of North Carolina. 
8. “The Distribution of the Ratio of the Variances of Variate Differences in the Circular 
Case,” J. N. K. Rao and G. Tintner, Iowa State University. 
9. “On Pairwise Independence,” Seymour Geisser and Nathan Mantel, National 
Institutes of Health, Bethesda, Maryland. 
10. “On Some Methods of Estimation for the Logarithmic Series Distribution,” G. P. 
Patil, University of Michigan. 
11. “On a Necessary and Sufficient Condition for a Set of Jointly Normal Variables to 
have a Common Variance and a Common Covariance,” B. R. Bhat, University of 
California, Berkeley. 


1:00 p.m.—Invited Papers on Optimization Processes (TIMS and IMS) 


Chairman: Richard Singleton, Stanford Research Institute 
1. “Optimum Monitoring Procedures,” Frank Proschan, Boeing Scientific Research 
Laboratory; Richard Barlow, Institute for Defense Analysis; and Larry Hunter, 
General Telephone and Electronic Laboratories. 
2. “On the Optimum Localization of Industries and Consumers,” Edmond Malinvaud, 
University of California. 
3. “Chance-Constrained Programming,” Abraham Charnes, Northwestern University, 
and William W. Cooper, Carnegie Institute of Technology. 
4. “Relations Between Stationary and Dynamic Programming Analysis of Inventory 
Processes,” Donald Iglehart, Stanford University. 


3:40 p.m.—Special Invited Paper 
Chairman: Jack Kiefer, Cornell University 
“Infinitely Divisible Distributions: Recent Results and Applications,” Marek Fisz, 
University of Warsaw and University of Washington. 


4:50 p.m.—Rietz Lecture 
Chairman: Erich Lehmann, University of California 
“Dynamic Programming,” David Blackwell, University of California. 


7:30 p.m.—American Mathematical Society Banquet-Tyee Yacht Club 


SATURDAY, JUNE 17, 1961 
8:30 a.m.—Invited Papers on Multivariate Normal Probabilities 


Chairman: E. L. Crow, National Bureau of Standards, Boulder, Colorado 
1. “Probability Content of Regions under Spherical Normal Distributions,” Harold 
Ruben, University of Sheffield, Sheffield, England. 
2. “Evaluation of Multivariate Normal Integrals,” S. S. Gupta, Bell Telephone 
Laboratories, Allentown, Pennsylvania. 
Discussion: 
Donald Owen, Sandia Corporation 
H. T. David, Iowa State University 


8:30 a.m.—Contributed Papers IV 


Chairman: DAVID WALLACE, University of Chicago and Center for Advanced Study in the 
Behavioral Sciences 
1. “Distribution Functions for Randomized Factorial Experiments,” S. ZACKS AND S. 
EHRENFELD, The Technion, Israel Institute of Technology and New York University. 

2. “A Method for Computing the Cumulative Distribution Function of the Product of Two 
Dependent t-Variables,” ROSEDITH SITGREAVES, Teachers College, Columbia University. 

3. “Some Distribution-Free Multiple Comparison Procedures in the Asymptotic Case,” 
PETER NEMENYI, S.U.N.Y. College of Medicine at Brooklyn. 

4. “Three-Quarter Replicates of 2^n Designs,” PETER W. M. JOHN, California Research 
Corporation and University of California. 

5. “Non-Parametric Sequential Tests for the Two-Sample and Several-Sample Problems,” 
R. R. M. GEOGHAGEN AND JOHN E. WALSH, System Development Corp. 


10:00 a.m.—Wald Lectures III 


Chairman: A. T. James, Yale University 
“Estimation of Many Parameters, III: Some Remarks on the Foundations of Statistics,” 
CHARLES STEIN, Stanford University 


10:00 a.m.—Invited Papers on Inventory and Renewal Processes (TIMS 
and IMS) 


Chairman: DONALD IGLEHART, Stanford University 
1. “Renewal Problems in Traffic,” WILLIAM JEWELL, University of California, Berkeley. 
2. “Superposition of Renewal Processes,” JOHN NYLANDER, Boeing Airplane Company. 
3. “Routine and Non-routine Shipment Policies,” RICHARD SINGLETON, Stanford Research 
Institute. 


11:10 a.m.—Contributed Papers V 


Chairman: CHARLES B. BELL, San Diego State College 

1. “Some Properties of a Large Set of Random Signals,” (Preliminary Report), NELSON 
M. BLACHMAN, Sylvania Electronic Defense Laboratories (Introduced by Emanuel 
Parzen). 

2. “Tests for Regression Coefficients When a Continuous Sample is Available,” M. M. 
SIDDIQUI, Boulder Laboratories, National Bureau of Standards, Boulder, Colorado. 

3. “Circular Probability Problems,” WILLIAM C. GUENTHER, The Martin Company and 
University of Wyoming. 

4. “Moments of the Radial Error,” ERNEST M. SCHEUER, Space Technology Laboratories, 
Inc. 


1:30 p.m.—Invited Papers on Limit Theorems 


Chairman: J. R. BLUM, Sandia Corporation 

1. “Some Limit Theorems for Stochastic Processes under Mixing Conditions,” DAVID L. 
HANSON, Sandia Corporation. 

2. “Examples of Invariance Principles for Stochastic Processes,” JOHN LAMPERTI, Stanford 
University. 


2:00 p.m.—Contributed Papers VI 


Chairman: DONALD R. TRUAX, University of Oregon 
1. “On the Cumulants of a General Renewal Process,” V. K. MURTHY, Stanford University. 
2. “A Property of Least Squares Estimator in Regression Analysis when the Independent 
Variables are Stochastic,” P. K. BHATTACHARYA, University of North Carolina. 
3. “On Tests with Likelihood Ratio Criteria in Some Problems of Multivariate Analysis,” 
(Preliminary Report), N. C. GIRI, Stanford University (Introduced by Charles 
Stein). 


4:30 p.m.—Tea-Faculty Club 
Contributed Papers Presented by Title: 


1. “On Separating a Deterministic Component from a Stochastic Sequence,” FRIEDHELM 
EICKER, University of North Carolina. 

2. “Some Properties of Compositions of an Integer and Their Application to Probability 
and Statistics,” T. V. NARAYANA AND S. G. MOHANTY, University of Alberta. 

3. “The Distribution of Probabilities in a Stochastic Learning Model,” (Preliminary 
Report), J. R. McGREGOR AND T. V. NARAYANA, University of Alberta. 

4. “On Fitting a Linear Trend and Testing Independence when the Residuals Form a 
Markov Process,” V. K. MURTHY, Stanford University. 

5. “On the Asymptotic Normality and Independence of the Sample Partial Autocorrelations 
for an Autoregressive Process,” V. K. MURTHY, Stanford University. 

6. “On Horvitz and Thompson’s T-Class Estimators,” S. G. PRABHU-AJGAONKAR AND 
B. D. TIKKIWAL, Karnatak University, Dharwar (India). 

7. “Comparing Distances between Multivariate Normal Populations, I,” (Preliminary 
Report), THEOPHILOS CACOULLOS, Columbia University. 

8. “Formulations of a Model Containing a Chance Mechanism According to which Observations 
are Missed: The Randomized Block Design,” JUNJIRO OGAWA, Nihon University, 
Tokyo, Japan, AND BERNARD S. PASTERNACK, Institute of Industrial Medicine, 
New York University Medical Center. 

9. “Approximation for the Entropy of Functions of Markov Chains,” JOHN J. BIRCH, 
University of Nebraska. 

10. “On a Problem in Hilbert Space with Applications,” J. R. BLUM AND D. L. HANSON, 
Sandia Corporation. 

11. “Some Asymptotic Properties of the Negative Binomial Distribution,” VIVIAN PESSIN, 
Children’s Hospital, Buffalo (Introduced by Norman C. Severo). 

12. “The Effect of Convergence to Normality on Tests of Hypothesis,” LLOYD J. MONTZINGO 
AND NORMAN C. SEVERO, University of Buffalo. 

13. “Convergence to Normality of Functions of a Normal Random Variable,” NORMAN C. 
SEVERO AND LLOYD J. MONTZINGO, University of Buffalo. 

14. “Selecting the ‘Best’ t out of k populations,” B. K. BHATTACHARYA, University of North 
Carolina (Introduced by S. N. Roy). 

15. “Percentile Estimators for the Parameters of the Exponential Failure Law,” SATYA 
D. DUBEY, Ivorydale Technical Center, Procter & Gamble Co., Cincinnati, Ohio. 

16. “On Step-Down Procedures in Simultaneous Multivariate Analysis of Variance,” 
(Preliminary Report), P. R. KRISHNAIAH, Remington Rand, UNIVAC. 




CONTEMPORARY MATHEMATICS 
to be repeated on 
CONTINENTAL CLASSROOM 
in 1961-1962 


The National Broadcasting Company announces that last season’s “Continental 
Classroom” course in Contemporary Mathematics will be repeated on 
color tape recordings from 6 to 6:30 a.m., local time, beginning September 25, 
1961. 

Contemporary Mathematics, the new course on “Continental Classroom” in 
1960-1961, is taught by John L. Kelley, Professor of Mathematics at the University 
of California, Berkeley, and Frederick Mosteller, Professor of Mathematical 
Statistics and Chairman of the Department of Statistics at Harvard 
University. Professor Kelley teaches Modern Algebra during the first semester 
of Contemporary Mathematics; Professor Mosteller teaches Probability and 
Statistics during the second. 

The new course on “Continental Classroom” during 1961-1962 will be a 
course in American Government. 


The Conference Board of the Mathematical Sciences is one of the sponsors of 
Contemporary Mathematics; the others are Learning Resources Institute and 
the National Broadcasting Company. 




PUBLICATIONS RECEIVED 


Becker, R. M. and Scott, W. Hazen, Jr., Particle Statistics of Infinite Populations as 
Applied to Mine Sampling, Bureau of Mines, Washington D.C., 1961. 78 pp. 45¢. 

Censo de Clases Pasivas del Estado en 1.° de Enero de 1958, Presidencia del Gobierno, 
Instituto Nacional de Estadistica, Madrid, 1960. 35 pp. 

“Employment of Scientific and Technical Personnel in State Government Agencies: Report 
on a 1959 Survey,” National Science Foundation, Washington D.C., 1961. 67 pp. 45¢. 

Nuclear Reactor Theory, Proceedings of Symposia in Applied Mathematics, Vol. XI, 
American Mathematical Society, Providence, R.I., 1961. 339 pp. 

Orleans, Leo A., Professional Manpower and Education in Communist China, National 
Science Foundation, Washington D.C., 1961. 260 pp. $2.00. 

Schultz, Vincent, An Annotated Bibliography on the Uses of Statistics in Ecology—a Search 
of 31 Periodicals, United States Atomic Energy Commission, Office of Technical 
Information, Washington D.C. 315 pp. $3.00. 

Solomon, Herbert, ed., Studies in Item Analysis and Prediction, Stanford University 
Press, Stanford, California, 1961. 310 pp. $8.75. 


The Passage Problem for a Stationary Markov Chain 


By J. H. B. Kemperman. Presents systematically a number of 
methods useful in studying the problems of first passage and ab- 
sorption in a Markov chain; in particular, methods for obtaining 
exact formulae for the probabilities under consideration or their 
moments. Numerous illustrations show adequately how each method 
serves as a natural tool for handling a large number of practical 
problems. $5.00 


Statistical Inference for Markov Processes 


By Patrick Billingsley. A general mathematical theory for the statis- 
tical problems of determining whether Markov models fit empirical 
data and of estimating any parameters upon which the models 
may depend. The applications which illustrate the mathematical 
results make the book useful to workers in the applied fields as 
well as to mathematicians, statisticians, and graduate students in 
statistics. $4.00 


UNIVERSITY OF CHICAGO PRESS 
5750 Ellis Avenue, Chicago 37, Illinois 


Announcing a new series of publications 
SELECTED TRANSLATIONS IN 
MATHEMATICAL STATISTICS AND PROBABILITY 
Volume I 


This volume contains 25 papers. Published for the Institute of 
Mathematical Statistics by the American Mathematical Society 


25% discount to members of IMS and AMS 
306 pages 


Orders for copies of Volume I and standing orders 
for this new series should be sent to the 


AMERICAN MATHEMATICAL SOCIETY 


190 Hope Street, Providence 6, R. I. 





PSAM Vol. XII 
PROCEEDINGS OF THE SYMPOSIUM ON THE 
STRUCTURE OF LANGUAGE AND ITS 
MATHEMATICAL ASPECTS 


The twenty articles in this book are texts of addresses which were de- 
livered at the symposium held in April, 1960. 

The authors contributing papers to this book are: W. V. Quine; Noam 
Chomsky; Hilary Putnam; H. Hiz; Nelson Goodman; Haskell B. Curry; Yuen 
Ren Chao; Murray Eden; Morris Halle; Robert Abernathy; Hans G. Herz- 
berger; Anthony G. Oettinger; Victor H. Yngve; Gordon E. Peterson and 
Frank Harary; Joachim Lambek; H. A. Gleason, Jr.; Benoit Mandelbrot; 
Charles F. Hockett; Rulon Wells; Roman Jakobson. 

283 Pages 25% discount to members $7.80 


AMERICAN MATHEMATICAL SOCIETY 
190 Hope Street, Providence 6, Rhode Island 


CONTENTS OF VOL. 23, PART 1, 1961 


Consistency in Statistical Inference and Decision. C. A. B. SMITH (With Discussion) 
Delays on a Two-Lane Road. J. C. TANNER 
A Queueing Model for Road Traffic Flow. A. J. MILLER (With discussion on the two papers) 
An Unbiased Estimator for Powers of the Arithmetic Mean. G. J. GLASSER 
A Bulk Service Queueing Problem with Variable Capacity. N. K. JAISWAL 
A Simple Method of Trend Construction. C. E. V. LESER 
The Real Stable Characteristic Functions and Chaotic Acceleration. I. J. GOOD 
Reply to Mr. Quenouille’s Comments about My Paper on Mixtures. H. SCHEFFÉ 
An Asymptotic Efficiency of Daniels’s Generalised Correlation Coefficients. D. J. FARLIE 
The Average Run Length of the Cumulative Sum Chart When a V-Mask is Used. K. W. KEMP 
The Time-Dependent Solution for an Infinite Dam with Discrete Additive Inputs. G. F. YEO 
Confidence Limits for Multivariate Ratios. B. M. BENNETT 
Estimation of the Parameters of a Linear Functional Relationship. M. DORFF AND J. GURLAND 
A Note on Vacancies on a Line. F. DOWNTON 
Inaccuracy and Inference. D. F. KERRIDGE 
A Simple Congestion System with Incomplete Service. D. R. COX 
A Test of Homogeneity for Ordered Variances. S. E. VINCENT 
The Solution of Queueing and Inventory Models by Semi-Markov Processes. A. J. FABENS 
A Note on the Renewal Function when the Mean Renewal Life Time is Infinite. W. L. SMITH 
The Moment Generating Function of the Truncated Multi-Normal Distribution. G. M. TALLIS 


The Royal Statistical Society, 21, Bentinck Street, London, W. 











ECONOMETRICA 


Journal of the Econometric Society 


Contents of Vol. 29, No. 2 - April 1961 


HERBERT A. SIMON AND ALBERT ANDO: Aggregation of Variables in Dynamic Systems 

FRANKLIN M. FISHER: On the Cost of Approximate Specification in Simultaneous 
Equation Estimation 

PETER R. FISK: The Graduation of Income Distributions 

PATRICK SUPPES: Behavioristic Foundations of Utility 

H. O. HARTLEY: Nonlinear Programming by the Simplex Method 

M. MORISHIMA AND F. SETON: Aggregation in Leontief Matrices and the Labour 
Theory of Value 

LEIF JOHANSEN: A Note on “Aggregation in Leontief Matrices and the Labour Theory 
of Value” 

A. L. NAGAR: A Note on the Residual Variance Estimation in Simultaneous Equations 

YASUSUKE MURAKAMI: A Note on the General Possibility Theorem of the Social 
Welfare Function 

LIONEL W. McKENZIE: On the Existence of General Equilibrium: Some Corrections 
REPORT OF THE ST. LOUIS MEETING 

REPORT OF THE STANFORD MEETING 

BOOK REVIEWS 

LETTER TO THE EDITOR 

ANNOUNCEMENTS AND NOTES 





JOURNAL OF 
THE AMERICAN STATISTICAL ASSOCIATION 


Volume 56 September, 1961 Number 295 
On an Index of Quality Change .... Irma Adelman and Zvi Griliches 
Distributions of Correlation Coefficients in Economic Time Series .... Edward Ames and Stanley Reiter 
Estimating a Mixed-Exponential Response Law .... F. J. Anscombe 
A Note on the Exact Finite Sample Frequency Functions of Generalized Classical 
Linear Estimators in Two Leading Over-Identified Cases .... R. L. Basmann 
The Other Side of the Lower Bound. A Note with a Correction .... Joseph Berkson 
Ex Ante and Ex Post Data in Inventory Investment .... Murray Brown 
A Note on the Asymptotic Normality of the Mann-Whitney-Wilcoxon Statistic .... Jack Capon 
Multiple Linear Regression Analysis with Adjustment for Class Differences .... M. Davies 
A Class of Distributions Applicable to Accidents .... Carol B. Edwards and John Gurland 
Estimation of Location and Scale Parameters in a Truncated Grouped Sech Square 
Distribution .... P. R. Fisk 
Fitting of Straight Lines and Prediction when both Variables are Subject to Error .... Max Halperin 
The Use of Sample Ranges in Setting Exact Confidence Bounds for the Standard 
Deviation of a Rectangular Population .... H. Leon Harter 
A Comparison of Major United States Religious Groups .... Bernard Lazerwitz 
The Progress of the Score during a Baseball Game .... G. R. Lindsey 
Further Comments on the “Final Report of the Advisory Committee on Weather 
Control” .... J. Neyman and E. L. Scott 
Bias in Pseudo-Random Numbers .... Paul Peach 
Length of Confidence Intervals .... John W. Pratt 
A Note on Griffin’s Paper “Graphic Computation of Tau as a Coefficient of Disarray” .... S. M. Shah 
A Model for Migration Analysis .... Ralph Thomlinson 
For further information, please contact: 
AMERICAN STATISTICAL ASSOCIATION 
1757 K Street, N.W. Washington 6, D. C. 





ESTADISTICA 


Journal of the Inter American Statistical Institute 
Vol. XVIII, No. 69 December 1960 
CONTENTS 


El Acuerdo Entre Consejo de la OEA y el IASI .... Tulo H. Montenegro 
Consideraciones Sobre la Educación Estadística .... Carlos E. Dieulefait 
Actividades del IASI en Materia de Educación Estadística .... Horacio D’Ottone 
Estadistica: Journal of the IASI .... Elizabeth Phelps Dunn 
The Argentine Census of September 30, 1960 .... Nathan Keyfitz 
El Censo Argentino del 30 de Septiembre de 1960 .... Nathan Keyfitz 
Estudios Demográficos Relativos a una Política de Población en los Países Latinoamericanos 
(traducción) .... Giorgio Mortara 
Tendencias Demográficas Recientes en el Nuevo Mundo—Una Visión General 
(traducción) .... Kingsley Davis 

Special Feature: Recent Developments in the Work of the Dominion Bureau of 
Statistics. 


Legal Provisions. Institute Affairs. Statistical News. Publications. 
Published quarterly Annual subscription price $3.00 (U. S.) 
INTER AMERICAN STATISTICAL INSTITUTE 


Pan American Union 
Washington 6, D. C. 


A Journal of Statistics 


for the Physical, Chemical and Engineering Sciences 


CONTENTS 


TECHNOMETRICS, Vol. 3. No. 3, August, 1961 


The 2^{k-p} Fractional Factorial Designs G. E. P. Box and J. S. Hunter; Reduced Designs of Resolution Five 
J. C. Whitwell and G. K. Morbey; Partial Confounding in Fractional Replication W. J. Youden; Finding 
New Fractions of Factorial Experimental Designs R. E. Fry; A Study of the Group Screening Method 
G. S. Watson; Missing Values in Response Surface Designs Norman R. Draper; Use of Tables of Percentage 
Points of Range and Studentized Range H. Leon Harter; The Optimum Allocation of Spare Components 
in Systems Donald F. Morrison; The Reliability of Components Exhibiting Cumulative Damage Effects 
George H. Weiss; An Analysis of Some Relay Failure Data from a Composite Exponential Population R. R. 
Prairie and B. Ostle; Applications of Truncated Distributions in Process Startups and Inventory Control 
H. Smith and D. W. Grace; Estimating the Poisson Parameter from Samples that Are Truncated on the Right 
A. Clifford Cohen, Jr.; A General Simulation Programme for Batch Chemical Plants J. Dyson, P. L. Goldsmith 
and J. S. M. Robertson; Book Reviews M. J. R. Healy and M. Stone; Statistical Programs for High 
Speed Computers; Notices 





Technometrics is published quarterly in February, May, August, and November. To members of the 
American Statistical Association and the American Society for Quality Control the rate is $6.00. The 
annual non-member subscription rate is $8.00. Checks should be made payable to Technometrics and 
addressed to Technometrics, Post Office Box 587, Benjamin Franklin Station, Washington 6, D. C. 